POLYCYSTIC KIDNEY DISEASE GENE HOMOLOGS REQUIRED FOR MALE
MATING BEHAVIOR IN NEMATODES AND ASSAYS BASED THEREON 

RELATED APPLICATIONS 

For U.S. purposes, benefit of priority under 35 U.S.C. §119(e) to
U.S. Provisional Application Serial No. 60/115,127, entitled
"CAENORHABDITIS ELEGANS STRAINS PERTURBED IN POLYCYSTIN
FUNCTION" to Paul W. Sternberg and Maureen M. Barr, filed January 6,
1999, is claimed herein. The subject matter of U.S. Provisional
Application Serial No. 60/115,127 is incorporated in its entirety by
reference.

FIELD OF INVENTION 

Systems and assays for identification of compounds that can be
used to treat polycystic kidney disease (PKD) are provided. Nematode
orthologs of genes involved in PKD are identified and associated with
mating behaviors. In particular, nematodes, such as Caenorhabditis
elegans, that express mutant and wild-type orthologs of human genes
involved in this disease are used to study the functions of the proteins
encoded by the genes, to screen for other genes involved in the disease,
to identify mutations involved in the disease, and to screen for drugs that
affect PKD. Hence, an animal model is provided that permits study of the
etiology of polycystic kidney disease and provides a tool to identify the
genes and factors involved in the disease pathway, and to identify
compounds that may be used to treat or alter the disease progression,
lessen its severity or ameliorate symptoms.

BACKGROUND

Polycystic Kidney Diseases 

Polycystic kidney diseases (PKD) are a group of disorders
characterized by the presence of a large number of fluid-filled cysts
throughout grossly enlarged kidneys (Gabow et al. (1992) Diseases of the
Kidney, Schrier et al., eds.). In humans, PKDs can be inherited in
autosomal dominant (ADPKD) or autosomal recessive (ARPKD) forms.
ADPKD is the more common form and is the most common dominantly
inherited kidney disease in humans, occurring at a frequency of about 1 in
800. ARPKD occurs at a frequency of about 1 in 10,000.

ADPKD is the most common single-gene disorder leading to kidney
failure (see, Emmons et al. (1999) Nature 401:339-340). Since ADPKD
is inherited as an autosomal dominant disorder, children of affected
parents have a one in two chance of inheriting the disease. Although the
kidney is the most severely affected organ, the disease is systemic and
affects the liver, pancreas, cardiovascular system and cerebrovascular
system. The major manifestation of the disorder is the progressive cystic
dilation of renal tubules (Gabow (1990) Am. J. Kidney Dis. 16:403-413),
leading to renal failure in half of affected individuals by age 50.
Microdissection, histochemical and immunologic studies show that cysts
in ARPKD kidneys arise from focal dilations of medullary collecting ducts
(McDonald (1991) Semin. Nephrol. 11:632-642). Although end-stage
renal failure usually supervenes in middle age (ADPKD is sometimes called
adult polycystic kidney disease), children may occasionally have severe
renal cystic disease.

ADPKD-associated renal cysts may enlarge to contain several liters
of fluid, and the kidneys usually enlarge progressively, causing pain.
Other abnormalities such as hematuria, renal and urinary infection, renal
tumors, salt and water imbalance and hypertension frequently result from
the renal defect. Cystic abnormalities in other organs, including the liver,
pancreas, spleen and ovaries, are commonly found in ADPKD. Massive
liver enlargement can cause portal hypertension and hepatic failure.
Cardiac valve abnormalities and an increased frequency of subarachnoid
and other intracranial hemorrhage have also been observed in ADPKD.
Progressive renal failure causes death in many ADPKD patients, and
dialysis and transplantation are frequently required to maintain life in
these patients.

Numerous biochemical abnormalities associated with this disease
also are observed. These include defects in protein sorting, the
distribution of cell membrane markers within renal epithelial cells,
extracellular matrix, ion transport, epithelial cell turnover, and epithelial
cell proliferation.

Three distinct loci have been shown to cause phenotypically
indistinguishable forms of ADPKD in humans. These include polycystin-1
(PKD1) on chromosome 16, polycystin-2 (PKD2) on chromosome 4, and
polycystin-3 (PKD3) (see, e.g., Reeders et al. (1985) Nature 317:542-544;
Kimberling et al. (1993) Genomics 18:467-472; Daoust et al.
(1995) Genomics 25:733-736). The ARPKD mutation is on human
chromosome 6 (Zerres et al. (1993) Nature Genet. 7:429-432). Two
proteins, polycystin-1 (PKD1) and polycystin-2 (PKD2), are defective in
human autosomal dominant polycystic kidney disease.

Mutations in either PKD1 or PKD2 cause almost indistinguishable
clinical symptoms. Mutations in PKD1 or PKD2 account for 95% of
autosomal dominant polycystic disease (Torres et al. (1998) Current
Opinion in Nephrology and Hypertension 7:159-169) with greater than
85-90% of disease incidence being due to mutations in PKD1.

The human PKD1 protein is an approximately 4,300 amino-acid
integral-membrane glycoprotein with a large amino-terminal extracellular
domain and a small, carboxy-terminal cytoplasmic tail. The human PKD1
gene (see, e.g., U.S. Patent No. 5,891,628), including the complete
nucleotide sequence of the gene's coding region (see SEQ ID No. 1) and
encoded amino acid sequence, is known (see SEQ ID No. 2). The
predicted structure of the domains suggests that it is involved in cell-cell
interactions or in interactions with the extracellular matrix. The PKD2
protein has similarities to PKD1, but its topology and domain structure
suggest that it might act as a subunit of a cation channel. These proteins
have been shown to interact directly (Mochizuki et al. (1996) Science
272:1339-1342; Qian (1997) Nature Genetics 16:179-183).

Although these genes have been implicated in the disorders, their
role in the etiology is not established. In addition, while kidneys
from ADPKD patients exhibit a number of different biochemical, structural
and physiological abnormalities, the disorder's underlying causative
biochemical defect is not known. Hence, the molecular mechanisms
leading to cyst enlargement and progressive loss of renal function in the
PKDs are not understood. Presently there are no cures or effective
treatments, other than palliative treatments, for these diseases. Hence,
there is a need to understand the underlying biochemistry and physiology
of ADPKD and to provide treatments.

Therefore, it is an object herein to provide a means to identify the 
underlying biochemistry and genetics of these diseases and to provide a 
means to identify compounds for use in treatment of these diseases.

SUMMARY

Isolated genes, cDNA and encoded proteins from nematodes that
participate in a pathway leading to an observable phenotype are provided.
In particular, it is shown herein that a mutation in C. elegans, which
gives rise to males that are defective in certain aspects of mating
behavior, lies in a gene designated herein lov-1 (location of vulva), and that
this gene is an ortholog of the mammalian, particularly human, PKD1
gene. A mutation in a gene designated pkd-2 herein also gives rise to
these behaviors. This gene is shown to be an ortholog of the mammalian,
including human, PKD2 gene.

The expression patterns of lov-1 and pkd-2 were studied, and it was
found that promoter sequences of both genes cause reporter genes to be
expressed in the rays and the hook sensory neurons required for
'response' and vulva location. This indicates that the LOV-1 and PKD-2
proteins are involved in chemosensory or mechanosensory signal
transduction in sensory neurons.

Hence, genes that are components of a pathway in nematodes are
provided and are shown to be linked to observable behaviors. Each of the
encoded proteins, LOV-1 and PKD-2, is a component in a pathway, which
appears to be a signal transduction pathway, that leads to the observed
phenotype. The genes from the nematode Caenorhabditis elegans are
exemplified herein.

The pathway is shown to be homologous to the pathway in which
the human polycystins, PKD1 and PKD2, participate. In particular, it is
shown herein that a mutation in nematodes, which gives rise to males
that are defective in mating behavior, lies in a gene designated herein
lov-1 (location of vulva). This gene, lov-1, is shown herein to be required
for two male sensory behaviors, 'response' and 'location of vulva' (Lov).

A second gene, designated pkd-2, that affects this behavior in a
similar manner is also identified and provided herein. The gene, cDNA,
and encoded protein are also
provided. In an exemplary embodiment, the C. elegans genome sequence
was used to isolate pkd-2. This gene is a nematode ortholog of the
mammalian, particularly human, PKD2 gene. Strains that contain knock-out
mutants of this gene also exhibit the defective mating behaviors.
In an exemplary embodiment, provided herein are the C. elegans
genes, designated lov-1 and pkd-2. SEQ ID No. 3 sets forth the
complement (i.e., the non-coding strand) of the lov-1 gene from C.
elegans. SEQ ID No. 4 sets forth the sequence of amino acids of the
encoded protein (N-terminus to C-terminus). SEQ ID No. 5 sets forth the
complement (i.e., the non-coding strand) of the pkd-2 gene
from C. elegans. SEQ ID No. 6 sets forth the encoded sequence of amino
acids.
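
Because SEQ ID Nos. 3 and 5 set forth the non-coding strands, the coding strand may be recovered by taking the reverse complement. The following is a minimal sketch in Python, assuming the listed strand is written 5' to 3'; the function name and the nine-base fragment are hypothetical illustrations and are not taken from the sequence listing.

```python
# Minimal sketch: recover a coding strand from a non-coding strand.
# Assumption: the input is written 5'->3'. The fragment used below is
# invented for illustration and is NOT part of SEQ ID No. 3 or 5.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement each base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


noncoding_fragment = "TTACATGGC"  # hypothetical non-coding fragment
coding_fragment = reverse_complement(noncoding_fragment)
print(coding_fragment)  # GCCATGTAA
```

Applying the function twice returns the original sequence, which provides a simple sanity check on any strand conversion.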

Also provided are mutants of the genes lov-1 and pkd-2 and
the resulting mutant encoded proteins. Nucleic acid molecules encoding
mutants of these genes are also provided. For example, deletion mutants
of these genes, particularly deletion mutants that substantially or
completely knock out gene product function, are provided. Thus, nucleic
acid molecules containing deletions of each of these genes and deletion
mutants that alter the phenotype of nematodes, such as C. elegans, that
contain these mutant genes are also provided. Constructs, vectors,
plasmids and strains containing each of the nucleic acid molecules are also
provided. Also provided are strains defective in these genes.

Also provided are strains containing the mutant nucleic acids.
Strains that manifest the defective male sensory behaviors are also
provided herein. Constructs containing the genes, vectors containing the
constructs, cells containing the vectors and transgenic C. elegans are
also provided. Assays that use these strains of C. elegans are also provided.

As noted, it is shown herein that these genes are nematode homologs
of the human genes that encode the polycystins, polycystin-1 (PKD1)
and polycystin-2 (PKD2), which are defective in human autosomal
dominant polycystic kidney disease. Hence, the genes and nematode
strains provide model systems for studying this pathway, identifying
additional components of the pathway, and for use in drug screening
assays to identify compounds that affect the pathway and/or compounds that
serve as leads for development of drugs for treatment of polycystic
kidney disease.

Each gene is shown to affect two sensory behaviors in C. elegans.
One behavior, designated "Response," refers to the response of males
to hermaphrodites; the other behavior, designated "Lov," refers to
location of the vulva by the male. Strains that are defective in either or
both of these genes are also provided. In particular, deletion mutants are
provided.

By correlating the phenotypic behaviors with wild-type or defective
forms of these genes, nematodes, such as C. elegans, can be used to identify
other genes involved in this pathway and to screen directly
for lead candidate compounds for drugs for treatment of PKD. Identification
of additional genes necessary for PKD function can provide additional
diagnostic tools for PKD. Hence, provided herein are mutant strains of C.
elegans and assays that use the strains.

Also provided herein are assays that employ the constructs,
vectors, plasmids and strains containing each of the nucleic acid molecules.
In particular, in one type of assay, wild-type nematodes
are mutagenized or treated with a test compound, and those that exhibit
a change in behavior are identified.

In other types of assays, nematodes that are defective in Lov
and/or Response are mutagenized or treated with a compound, and those
that exhibit a change in behavior are identified. Test compounds or
mutations responsible for the change in behavior are identified. Such
compounds are candidates for treatment of PKDs.

Among these methods are those that involve contacting a
nematode that exhibits normal mating behavior with a test compound,
and selecting compounds that result in altered mating behavior, wherein
the altered mating behavior comprises alteration in the behavior involving
location of vulva and/or response to contact with the hermaphrodite.
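
The selection step described above can be illustrated by a short scoring sketch. The following Python example is only one possible way to tabulate such observations: the record fields, the efficiency definitions (responses per contact, stops per vulva encounter) and the 0.25 threshold are all assumptions made for illustration and are not specified herein.

```python
# Minimal scoring sketch for a behavioral compound screen.
# Assumptions (hypothetical, for illustration only): the field names,
# the two efficiency ratios, and the min_change threshold.
from dataclasses import dataclass
from typing import List


@dataclass
class MaleTrial:
    contacts: int          # contacts with a hermaphrodite
    responses: int         # contacts followed by backing (Response)
    vulva_encounters: int  # passes over the vulva while backing
    vulva_stops: int       # encounters at which the male stopped (Lov)


def response_efficiency(trials: List[MaleTrial]) -> float:
    total = sum(t.contacts for t in trials)
    return sum(t.responses for t in trials) / total if total else 0.0


def lov_efficiency(trials: List[MaleTrial]) -> float:
    total = sum(t.vulva_encounters for t in trials)
    return sum(t.vulva_stops for t in trials) / total if total else 0.0


def alters_behavior(control: List[MaleTrial], treated: List[MaleTrial],
                    min_change: float = 0.25) -> bool:
    """Flag a compound if either efficiency shifts by at least min_change."""
    d_resp = abs(response_efficiency(control) - response_efficiency(treated))
    d_lov = abs(lov_efficiency(control) - lov_efficiency(treated))
    return d_resp >= min_change or d_lov >= min_change


control = [MaleTrial(10, 9, 10, 9)]
treated = [MaleTrial(10, 4, 10, 3)]
print(alters_behavior(control, treated))  # True
```

In practice the threshold would be replaced by a statistical comparison suited to the number of animals scored; the fixed cutoff is used here only to keep the sketch short.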

Also provided are methods for identifying genes involved in
autosomal dominant polycystic kidney disease (ADPKD). Among these
methods are those involving mutagenizing nematodes that exhibit
normal mating behavior, and identifying and selecting nematodes that
exhibit altered mating behavior, where the altered mating behavior is
manifested as an alteration in location of vulva and/or response to
contact with the hermaphrodite. The mutated gene(s) responsible for the
alteration in behavior are then identified. Databases or libraries of
mammalian genes can be screened to identify homologs of these genes,
which can then serve as therapeutic or diagnostic targets or aid in
elucidation of the disease pathology.

Methods for identifying compounds that are candidate therapeutic
agents for treatment of autosomal dominant polycystic kidney disease
(ADPKD) are provided. Among the methods are those in which normal
males are treated with a candidate compound. Compounds that result in
changes in mating behaviors or changes in mating efficiencies are
selected.

Methods for identifying genes involved in the disease pathway are
also provided. Among the methods are those in which normal males are
mutagenized. Offspring that exhibit changes in mating behaviors or
changes in mating efficiencies are selected, and mutated genes are
identified and shown to be part of the pathway. Mammalian, particularly
human, homologs of the mutated genes are then identified. Such genes
are likely to be part of the disease pathway. Such genes can serve as
therapeutic targets and disease markers for diagnosis.

Other assays use nematode strains that have mutations in either or
both of lov-1 and pkd-2. As described herein, suppressor and enhancer
genetics can be used to assign functions to genes, to assign genes to
pathways, to identify the key switches in these pathways and to provide
a sensitive assay to identify new genes in a pathway and lead compounds
that modulate the activity of genes and/or gene products in the pathway.

Assays that identify the role of PKD proteins in sensory function
are also provided. Since lov-1 and pkd-2 are expressed in CEM neurons,
they have activity in other sensory functions, such as finding the mating
partner at a distance. Accordingly, assays using sexual chemotaxis or
kinesis are provided. For example, males that are mutagenized or treated
with a test compound are placed on a surface containing males and
hermaphrodites, and are then observed to assess whether they can
choose between males and hermaphrodites. If the male is defective in
this sensory function, it will not distinguish between males and
hermaphrodites.

Assays that use dominant negative forms of PKD in nematodes or
in other cells to identify mutations and/or compounds that inhibit PKD
function are also provided. Transgenic nematodes that express a version
of the LOV-1 or PKD-2 protein that inhibits the activity of LOV-1 and/or
PKD-2, as assessed by manifestation of the altered Lov and/or Response
phenotypic behavior(s), are used in these assays. Transgenic nematodes
can be produced by any method known to those of skill in the art,
including, but not limited to, injection of the nucleic acid into the embryos
or cells of the animal. Transgenic nematodes that contain a dominant
negative lov-1 or pkd-2 transgene are contacted with a test compound,
and compounds that interfere with a remaining activity of the LOV-1 or
PKD-2 protein are selected. Alternatively, these transgenic nematodes
are mutagenized and mutants that lose a remaining activity are selected,
and the gene or mutation responsible for the loss or that contributes to
the loss is identified.

Assays based on localization and trafficking of LOV-1 and/or PKD-2 
within a cell or cells are also provided. These assays can identify 
regulators and factors necessary for synthesis and transport of LOV-1 
and/or PKD-2 proteins and employ strains in which LOV-1 and PKD-2 are 
expressed linked to a detectable label, such as a fluorescent protein. 
These strains are used to assess the effects of compounds or 
mutagenesis on the trafficking patterns of LOV-1 and PKD-2 and cellular 
location(s) of the proteins in the animal. Identified mutations can be 
mapped and the genes identified. If mammalian, particularly human, 
homologs of these identified genes exist, such genes can serve as 
therapeutic or diagnostic targets and can aid in elucidation of the disease 
in mammals, particularly humans. 

Assays for identification of transcriptional regulators of expression
of lov-1 and/or pkd-2 are also provided. These assays screen for loss or
alteration of expression of either gene and use transgenic nematodes with
a reporter gene, such as a gene encoding a fluorescent protein (FP) or lacZ
or other detectable
product, linked to the nucleic acid encoding lov-1 or pkd-2. The animal is
mutagenized or treated with a test compound, and loss of expression or
reduction in expression of either gene is assessed. These assays identify
regulators of and factors that affect lov-1 and pkd-2 expression.
Mammalian, particularly human, homologs of these regulators and factors
are identified. Such regulators and factors can be therapeutic or
diagnostic targets, and/or can aid in developing an understanding of the
development and progression of PKD in mammals.

Kits for performing the assays, particularly, the drug screening 
assays, are also provided. The kits include transgenic or wild-type 
nematodes or both that express either wild-type or a mutant or a 
transgenic form of lov-1 and/or pkd-2. The nematodes may be on plates, 
in wells or in any form suitable for the assays. Kits containing nucleic 
acid encoding either of the two genes or probes based upon these 
sequences or reporter gene constructions containing all or portions of 
either or both genes are also provided. The nucleic acids may be in 
solution, in lyophilized or other concentrated form, or may be bound to a 
suitable substrate. The kits can include additional reagents for performing
the assays; such reagents include any needed to perform any of the steps of
the methods. The kits include instructions for performing the assays.

DESCRIPTION OF FIGURES

Figure 1 depicts male mating behavior of C. elegans. The
hermaphrodite is larger than the male and her vulva is depicted as a slit
on the ventral, posterior third of her body. The male tail is placed flush on
the hermaphrodite, ventral side down. His spicules are depicted by a line
in the tail. The hook is anterior to the spicules; the post-cloacal sensilla are
posterior. Sequence 1 illustrates wild-type male Lov. Sequence 2
represents hook-ablated aberrant Lov behavior (passing and slow search).
Sequence 3 portrays lov-1(sy552) mutant behavior (passing and
eventually stopping).

Figure 2 depicts the molecular nature of lov-1. a, Genetic and
physical maps of the lov-1 region on chromosome 2. Genetic markers are
shown. Boundaries of a lov-1 deletion (mnDf21) and non-deletion (eDf21)
are indicated. + designates rescue of lov-1(sy552) mutant males.
Numbers in parentheses indicate the ratio of rescuing stable lines to total
stable lines examined. b, lov-1 gene structure. Exons are boxed.
Genefinder predicts two ORFs, ZK945.10 (9 exons) and ZK945.9 (19
exons). RT-PCR reveals that lov-1 corresponds to the combination of
ZK945.10 and ZK945.9. The arrow indicates the 1059 bp deletion in
lov-1(sy582Δ). c, lov-1::GFP (green fluorescent protein) expression
constructs, patterns, and phenotypes in wild-type background. d, lov-1
encodes a membrane-associated protein with homology to the polycystin
and voltage-activated channel families. A schematic representation of
LOV-1 is shown to demonstrate domains of the protein. These include
the amino terminus that is serine/threonine rich with multiple potential
glycosylation sites, an ATP/GTP binding domain (indicated by the
asterisks), followed by two polycystin blocks of homology. Block 1 is
exclusively homologous to PKD1, while Block 2 shows homology with all
polycystins and also the family of voltage-activated Ca2+ channels. Block
1 is a conserved domain of unknown function that also occurs at the
N-terminus of most 5-lipoxygenases. Identity (%) and number of identical
amino acids (in parentheses) between LOV-1 and a particular polycystin are
indicated. Although LOV-1 lacks the carboxy-terminal coiled-coil domain
of all known polycystins, a coiled-coil is predicted in the middle of LOV-1
using the most stringent criteria for the COILS program (data not shown).
Y73F8A.B + A was identified in a Blast search of unpublished sequences
available through the Sanger Center and is more similar to PKD2 (30%
identity, 48% similarity, 13% gaps over 752 aa) than LOV-1 (25%
identity, 44% similarity, 14% gaps over 367 aa).

Figure 3 shows the lov-1 and pkd-2 genomic structures,
constructs, rescue data and expression patterns; the line above lov-1
indicates the 1,059 bp deletion in lov-1(sy582Δ); numbers in parentheses
indicate the ratio of rescuing stable lines to the number of stable lines
examined; DN is dominant negative.

Figure 4 shows that lov-1::GFP1 and PKD-2::GFP2 are colocalized
to cell bodies and dendrites and are specifically expressed in adult male
sensory neurons; the spicules, hook structure and posteriormost fan region
autofluoresce. Arrows indicate neuronal cell bodies and arrowheads
denote dendrites or ciliated endings. a-c, lov-1::GFP1: (a) HOB and ray cell
bodies (arrows), HOB dendritic process (arrowhead); (b) HOB and ray
process 5 (arrowheads); (c) ciliated endings in nose tip from male-specific
cephalic CEM neurons (cell bodies not shown). d-f, pkd-2::GFP2:
(d) ray cell bodies (arrow) and ray process 2 (arrowhead); (e) ray process
5 (arrowhead); (f) male-specific cephalic CEM ciliated endings (arrow).
Scale bar corresponds to 20 µm.

DETAILED DESCRIPTION
Definitions 

Unless defined otherwise, all technical and scientific terms used 
herein have the same meaning as is commonly understood by one of skill 
in the art to which this invention belongs. Caenorhabditis elegans 
nomenclature is well understood by those of skill in this area (see, e.g., 
Methods in Cell Biology C. elegans I, and II, Cold Spring Harbor Press 
Books, Shakes, Epstein eds). 

All patents, patent applications and publications referred to anywhere
herein, including the background, are, unless noted otherwise,
incorporated by reference in their entirety. In the event a definition in this 
section is not consistent with definitions elsewhere, the definition set 
forth in this section will control. 

As used herein, nematode is intended to refer generally to the class 
Nematoda or Nematoidea and includes those animals of a slender 
cylindrical or thread-like form commonly called roundworms. Among 
those species, members of the genus Caenorhabditis are preferred, but
any species that can be cultured in the laboratory may be used.

As used herein, the term "mutant," as in "nematode mutant" or
"mutant nematode," is intended to refer generally to a nematode that
contains an altered genotype, preferably stably altered. The altered
genotype results from a mutation not generally found in the genome of
the wild-type nematode.

As used herein, a mutant gene, such as a mutant lov-1 or pkd-2
gene, refers to a gene that is altered, whereby a nematode with such a
gene expresses an altered phenotype compared to a nematode with the
wild-type gene, such as the genes set forth in SEQ ID Nos. 3 and 5
(which set forth the non-coding strands). Mutations include point
mutations, insertions, deletions, rearrangements and any other change in
the gene that results in an altered phenotype. Deletion mutants that
eliminate the function of the encoded protein (knock-out mutations) are
exemplified herein. Not all mutations necessarily completely destroy
the activity of the protein.

As used herein, "normal mating behavior" means that the animal
exhibits behavior typical of wild-type nematodes with respect to the
location of vulva (Lov) and response of males to hermaphrodites. Thus,
a male that exhibits "normal mating behavior," upon encountering a
hermaphrodite, ceases forward motion, places his tail flush on the
hermaphrodite, commences backing along her body, and turns at her ends
until he encounters her vulva and stops. This is the behavior of a
lov-1(+) male. Mutant males defective in lov-1 frequently do not respond to
contact with the hermaphrodite and continue blindly moving forward.
When response is initiated, lov-1 mutants back and turn normally but
pass the vulva at a high frequency. Thus, they can mate with paralyzed
or otherwise slow-moving hermaphrodites.

As used herein, a mammalian homolog of a nematode gene refers
to a gene that encodes a protein that exhibits identifiable sequence
homology and conservation of structure. The degree of sequence
homology required for a mammalian and a nematode protein or gene to be
considered homologs depends upon the gene considered but is typically at
least about 30% at the protein level. An ortholog will typically have
greater sequence similarity, and conservation of structure and often
function. Methods and criteria for identifying mammalian, including
human, homologs of nematode genes are known to those of skill in the
art and involve a comparison of the sequence and structural features of
the encoded protein.

As used herein, a dominant negative mutation is a mutation that
encodes a polypeptide that, when expressed, disrupts the activity of the
protein encoded by the wild-type gene (see, Herskowitz (1987) Nature
329:219-222). To block the function of the wild-type gene, a cloned
gene is altered so that it encodes a mutant product that inhibits the
wild-type gene product in a cell or organism. As a result, the cell or organism
is deficient in the product. The mutation is "dominant" because its
phenotype is manifested in the presence of the wild-type gene, and it is
"negative" in the sense that it inactivates the wild-type gene function. It
is possible to do this because proteins have multiple functional sites.

As used herein, a "library" of nematodes is a collection of a
plurality of nematodes, typically more than 10, preferably more than 100.
Typically a library will include a variety of different nematodes and may
include wild-type and mutant nematodes and a sufficient number to
achieve the intended purpose for which the library is used.

As used herein, a gene encoding LOV-1 protein refers to a gene (a 
sequence of nucleotides including introns, and exons, and optionally 
transcriptional regulatory sequences) from any nematode that encodes a 
protein that performs the same function in the nematode as the LOV-1 
protein provided herein. Such protein can be identified using the 
methods provided herein for identifying it in C. elegans, or by isolating 
cDNA encoding the protein using probes constructed from the nucleic 
acid provided herein to isolate it using standard methods. Typically the 
coding sequence of the gene provided herein will hybridize along its 
length to the coding sequence of a related gene under conditions of at 
least low stringency, preferably moderate stringency, and likely under
conditions of high stringency. Nucleic acid encoding a LOV-1 protein
includes any nucleic acid molecule, DNA, cDNA, RNA, that encodes a 
protein that has substantially the sequence of amino acids set forth in 
SEQ ID No. 4 and encodes a protein that has the same activity as this 
protein. Minor sequence variations from species to species and even 
among a species are considered to be substantially the same sequence. 
Such nucleic acid will hybridize to the nucleic acid encoding the proteins 
provided herein under conditions of at least low stringency, preferably 
moderate stringency and more preferably high stringency. 

As used herein, a gene encoding PKD-2 protein from a nematode is
similarly defined, except that it has substantially the same sequence
as the sequence of amino acids set forth in SEQ ID No. 6. Identification
of these proteins and their functions in C. elegans permits
similar identification in other nematode species.

As used herein, stringency conditions refer to the washing 
conditions for removing the non-specific probes and conditions that are 
equivalent to either high, medium, or low stringency as described below: 

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C.

It is understood that equivalent stringencies may be achieved using 
alternative buffers, salts and temperatures. 
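
For record keeping in hybridization experiments, the three wash conditions defined above can be held in a small lookup table. The following Python sketch is illustrative only; the type and function names are hypothetical, and the comparison rule (lower salt and equal or higher temperature means more stringent) is a general rule of thumb assumed here, not a definition given herein.

```python
# Minimal sketch: the three wash conditions as a lookup table.
# The numeric values come from the definitions above; the names and the
# comparison heuristic are assumptions made for illustration.
from typing import NamedTuple


class Wash(NamedTuple):
    sspe_x: float       # SSPE concentration, as a multiple of 1x
    sds_percent: float  # SDS, percent
    temp_c: int         # wash temperature, degrees Celsius


WASH_CONDITIONS = {
    "high": Wash(0.1, 0.1, 65),
    "medium": Wash(0.2, 0.1, 50),
    "low": Wash(1.0, 0.1, 50),
}


def at_least_as_stringent(a: str, b: str) -> bool:
    """True if condition a uses no more salt and no lower temperature than b."""
    wa, wb = WASH_CONDITIONS[a], WASH_CONDITIONS[b]
    return wa.sspe_x <= wb.sspe_x and wa.temp_c >= wb.temp_c


print(at_least_as_stringent("high", "medium"))  # True
print(at_least_as_stringent("low", "medium"))   # False
```

Equivalent stringencies achieved with alternative buffers would need their own entries; the heuristic does not attempt to compare across buffer systems.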

As used herein, percentage or amount or degree of sequence
identity is used interchangeably with homology and refers to sequence
identity or homology determined using standard alignment programs with
gap penalties and other parameters set to the manufacturer's default
settings. It is understood that for relatively high levels of sequence
identity or homology, the particular program selected and/or defaults set
for various parameters do not substantially affect the results. Hence, for
example, a requirement for 90% sequence identity of a nucleic acid
sequence with another can be determined using any program known to
the skilled artisan or manually, and such percentage can encompass
about 85% to 95% identity.
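
A manual determination of percent identity from a pre-computed pairwise alignment may be sketched as follows. This Python example is a minimal illustration under stated assumptions: identity is computed over aligned columns in which neither sequence has a gap (other conventions for the denominator exist), the 5-point tolerance is one reading of the "about 85% to 95%" range above, and the function names and the short peptide strings are hypothetical.

```python
# Minimal sketch: percent identity from an existing pairwise alignment.
# Assumptions: gaps are written "-", gapped columns are excluded from the
# denominator, and the sequences below are invented for illustration.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def meets_requirement(identity: float, required: float = 90.0,
                      tolerance: float = 5.0) -> bool:
    """Treat a stated requirement as satisfied down to required - tolerance."""
    return identity >= required - tolerance


pid = percent_identity("MKT-LLV", "MKTALIV")  # 5 matches over 6 ungapped columns
print(round(pid, 1))  # 83.3
```

The alignment itself would be produced by a standard alignment program at its default settings, as stated above; this sketch only tallies the result.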

As used herein, reference to a drug refers to a chemical entity, 
whether in the solid, liquid, or gaseous phase that is capable of providing 
a desired therapeutic effect when administered to a subject. The term 
"drug" should be read to include synthetic compounds, natural products 
and macromolecular entities such as polypeptides, polynucleotides, or 
lipids and also small molecules, including, but not limited to,
neurotransmitters, ligands, hormones and elemental compounds. The 
term "drug" is meant to refer to that compound whether it is in a crude 
mixture or purified and isolated. 

As used herein, heterologous or foreign DNA and RNA are used 
interchangeably and refer to DNA or RNA that does not occur naturally as 
part of the genome in which it is present or which is found in a location 
or locations in the genome that differ from that in which it occurs in 
nature. Heterologous nucleic acid is generally not endogenous to the cell 
into which it is introduced, but has been obtained from another cell or 
prepared synthetically. Generally, although not necessarily, such nucleic 
acid encodes RNA and proteins that are not normally produced by the cell 
in which it is expressed. Any DNA or RNA that one of skill in the art 
would recognize or consider as heterologous or foreign to the cell in 
which it is expressed is herein encompassed by heterologous DNA. 
Examples of heterologous DNA include, but are not limited to, DNA that 
encodes exogenous invertase. Heterologous DNA and RNA may also 
encode RNA or proteins that mediate or alter expression of endogenous 
DNA by affecting transcription, translation, or other regulatable 
biochemical processes. 

As used herein, operative linkage of heterologous DNA to 
regulatory and effector sequences of nucleotides, such as promoters, 
enhancers, transcriptional and translational stop sites, and other signal 
sequences refers to the relationship between such DNA and such 
sequences of nucleotides. For example, operative linkage of heterologous 
DNA to a promoter refers to the physical relationship between the DNA 
and the promoter such that the transcription of such DNA is initiated from 
the promoter by an RNA polymerase that specifically recognizes, binds to 
and transcribes the DNA in reading frame. 

As used herein, a gene containing a heterologous transcriptional or 
translational or processing control region(s) refers to a nucleic acid 
molecule or construct that includes a coding portion of a gene operatively 
linked to such a region derived from a different gene. A homologous 
transcriptional or translational or processing control region(s) refers to a 
nucleic acid molecule or construct that includes a coding portion of a gene 
operatively linked to such a region derived from the same gene. 

As used herein, a promoter region refers to the portion of DNA of a 
gene that controls expression of DNA to which it is operatively linked. 
The promoter region includes specific sequences of DNA that are 
sufficient for RNA polymerase recognition, binding and transcription 
initiation. This portion of the promoter region is referred to as the 
promoter. In addition, the promoter region includes sequences that 
modulate this recognition, binding and transcription initiation activity of 
the RNA polymerase. These sequences may be cis acting or may be 
responsive to trans acting factors. Promoters, depending upon the nature 
of the regulation, may be constitutive or regulated. A constitutive 
promoter is always turned on. A regulatable promoter requires specific 
signals to be turned on or off. A developmentally regulated promoter is 
one that is turned on or off as a function of development. 

As used herein, regulatory sequences include sequences of 
nucleotides that function, for example, as transcriptional and translational 
control sequences. Transcriptional control sequences include the 
promoter and other regulatory regions, such as enhancer sequences, that 
modulate the activity of the promoter, or control sequences that modulate 
the activity or efficiency of the RNA polymerase that recognizes the 
promoter, or control sequences that are recognized by effector molecules, 
including those that are specifically induced by interaction of an 
extracellular signal with a cell surface protein. For example, modulation 
of the activity of the promoter may be effected by altering the RNA 
polymerase binding to the promoter region, or, alternatively, by interfering 
with initiation of transcription or elongation of the mRNA. Such 
sequences are herein collectively referred to as transcriptional control 
elements or sequences. In addition, transcriptional control sequences 
include sequences of nucleotides that alter translation of the resulting 
mRNA, thereby altering the amount of a gene product. 

As used herein, a reporter gene refers to a gene that encodes a 
detectable product. Such genes are well known to those of skill in the art 
and include, but are not limited to, genes encoding fluorescent proteins, 
particularly the well-known green fluorescent proteins, lacZ, enzymes and 
other such reporters known to be expressible and detectable in 
nematodes. These genes are linked to a gene of interest whereby upon 
expression a detectable fusion protein is produced. For purposes herein, 
such fusions are exemplified using an aequorin GFP (see, Chalfie et al. 
(1994) Science 263:802-805; see, also U.S. Patent No. 5,741,668), but 
any such protein may be used. For example, GFP from Aequorea victoria 
contains 238 amino acids, absorbs blue light and emits green light; it has 
been cloned and its sequence characterized; various mutants are also well 
known. Nematode optimized codons may be selected. 

As used herein, a reporter gene construct is a nucleic acid molecule 
that includes a reporter gene operatively linked to transcriptional control 
sequences. Typically the construct will also include all or a portion of 
the gene of interest, which herein is lov-1 and/or pkd-2, and the reporter 
gene will be under the control of the lov-1 or pkd-2 promoter and other 
regulatory regions. By operatively linked is meant linked whereby an 
in-frame fusion protein is produced upon expression of the construct and 
whereby the reporter gene product is active (i.e., produces a detectable 
signal or is active). The reporter gene may be linked to the 3' or 5' end 
or in any other orientation whereby it is expressed and operates as a 
reporter. 
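
The in-frame requirement in the definition above can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the lengths supplied count only spliced coding nucleotides (introns excluded) upstream of the reporter's first codon.

```python
def preserves_reading_frame(upstream_coding_nt: int, linker_nt: int = 0) -> bool:
    """True when the reporter's first codon starts on a codon boundary.

    upstream_coding_nt: coding nucleotides of the gene of interest that
    precede the junction (assumed to exclude introns).
    linker_nt: any linker nucleotides inserted at the junction.
    """
    if upstream_coding_nt < 0 or linker_nt < 0:
        raise ValueError("lengths must be non-negative")
    return (upstream_coding_nt + linker_nt) % 3 == 0


def linker_padding_needed(upstream_coding_nt: int) -> int:
    """Number of nucleotides (0, 1 or 2) to add so the junction is in frame."""
    if upstream_coding_nt < 0:
        raise ValueError("length must be non-negative")
    return (-upstream_coding_nt) % 3
```

A junction after 301 coding nucleotides, for instance, would need two additional nucleotides for the reporter to be read in frame.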

As used herein, isolated, substantially pure DNA refers to DNA 
molecules or fragments purified according to standard techniques 
employed by those skilled in the art, such as those described in Sambrook 
et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, NY. 

As used herein, expression refers to the process by which nucleic 
acid is transcribed into mRNA and translated into peptides, polypeptides, 
or proteins. If the nucleic acid is derived from genomic DNA, expression 
may, if an appropriate eukaryotic host cell or organism is selected, include 
splicing of the mRNA. 

As used herein, cloning vehicle or vector, which are used 
interchangeably, refers to a plasmid or phage DNA or other DNA 
molecules that replicate autonomously in a host cell, and that include one 
or a small number of endonuclease recognition sites at which such DNA 
may be cut in a determinable fashion without loss of an essential 
biological function of the vehicle, and into which DNA may be spliced in 
order to bring about its replication and cloning. The cloning vehicle may 
further contain a marker suitable for use in the identification of cells 
transformed with the cloning vehicle. Markers include, but are not limited 
to, tetracycline resistance and ampicillin resistance. 

Appropriate expression vectors are well known to those of skill in 
the art and include those that are replicable in eukaryotic cells and/or 
prokaryotic cells. Such expression vectors may remain episomal or may 
integrate into the host cell genome. Expression vectors suitable for 
introducing heterologous DNA into plants and into host cells in culture, 
such as mammalian cells and methylotrophic yeast host cells, are known 
to those of skill in the art. It should be noted that, because the functions 
of plasmids, vectors and expression vectors overlap, those of skill in the 
art use these terms, plasmid, vector, and expression vector, 
interchangeably. Those of skill in the art, however, recognize what is 
intended from the purpose for which the vector, plasmid or expression 
vector is used. 

As used herein, integrated into the genome means integrated into a 
chromosome or chromosomes. 

As used herein, a "fragment" of a protein refers to any portion of a 
protein that contains less than the complete amino acid sequence of the 
protein but that retains a biological or chemical function of interest. 

As used herein, expression vector or expression vehicle refers to 
such vehicle or vector that is capable, after transformation into a host, of 
expressing a gene cloned therein. The cloned gene is usually placed 
under the control of (i.e., operably linked to) certain control sequences 
such as promoter sequences. Expression control sequences will vary 
depending on whether the vector is designed to express the operably 
linked gene in a prokaryotic or eukaryotic host and may additionally 
contain transcriptional elements such as enhancer elements, termination 
sequences, tissue-specificity elements, and/or translational initiation and 
termination sites. 

As used herein, a variant of a protein refers to a 
protein substantially similar in structure and biological activity to 
either the entire protein or a fragment thereof. Thus, provided that two 
proteins possess a similar activity, they are considered variants as that 
term is used herein even if the composition or secondary, tertiary, or 
quaternary structure of one of the molecules is not identical 
to that found in the other, or if the sequence of amino acid residues is 
not identical. 

It is also understood that any of the proteins or portions disclosed 
herein may be modified by making conservative amino acid substitutions 
and the resulting modified subunits are contemplated herein. Suitable 
conservative substitutions of amino acids are known to those of skill in 
this art and may be made generally without altering the biological activity 
of the resulting molecule. Those of skill in this art recognize that, in 
general, single amino acid substitutions in non-essential regions of a 
polypeptide do not substantially alter biological activity (see, e.g., Watson 
et al. Molecular Biology of the Gene, 4th Edition, 1987, The 
Benjamin/Cummings Pub. Co., p. 224). Such substitutions are preferably, 
although not exclusively, made in accordance with those set forth in 
TABLE 1 as follows: 

TABLE 1 

Original residue     Conservative substitution 
Ala (A)              Gly; Ser 
Arg (R)              Lys 
Asn (N)              Gln; His 
Cys (C)              Ser 
Gln (Q)              Asn 
Glu (E)              Asp 
Gly (G)              Ala; Pro 
His (H)              Asn; Gln 
Ile (I)              Leu; Val 
Leu (L)              Ile; Val 
Lys (K)              Arg; Gln; Glu 
Met (M)              Leu; Tyr; Ile 
Phe (F)              Met; Leu; Tyr 
Ser (S)              Thr 
Thr (T)              Ser 
Trp (W)              Tyr 
Tyr (Y)              Trp; Phe 
Val (V)              Ile; Leu 

Comparable mutations may be made at the nucleotide sequence level. 
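
By way of illustration only, the substitutions of TABLE 1 can be encoded as a lookup and used to flag which differences between a reference and a variant sequence are conservative in the sense of that table. The names below are hypothetical, one-letter codes are used, and the sketch assumes two ungapped sequences of equal length; residues absent from TABLE 1 (for example Asp and Pro as original residues) are treated as having no listed substitution.

```python
# TABLE 1 in one-letter code: original residue -> listed conservative substitutions
CONSERVATIVE = {
    "A": {"G", "S"}, "R": {"K"}, "N": {"Q", "H"}, "C": {"S"},
    "Q": {"N"}, "E": {"D"}, "G": {"A", "P"}, "H": {"N", "Q"},
    "I": {"L", "V"}, "L": {"I", "V"}, "K": {"R", "Q", "E"},
    "M": {"L", "Y", "I"}, "F": {"M", "L", "Y"}, "S": {"T"},
    "T": {"S"}, "W": {"Y"}, "Y": {"W", "F"}, "V": {"I", "L"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if TABLE 1 lists `replacement` for `original` (directional lookup)."""
    return replacement.upper() in CONSERVATIVE.get(original.upper(), set())


def classify_substitutions(reference: str, variant: str):
    """Return (position, ref_residue, var_residue, conservative?) for each
    difference; positions are 1-based. Assumes equal-length, ungapped input."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if r != v
    ]
```

Note that the lookup is directional, following the table as written: Ala to Gly is listed, whereas Gly to Ser is not.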

Other substitutions are also permissible and may be determined 
empirically or in accord with known conservative substitutions. Any 
such modification of the polypeptide may be effected by any means 
known to those of skill in this art. Mutation may be effected by any 
method known to those of skill in the art, such as by chemicals or 
radiation, and also including site-specific or site-directed mutagenesis of 
DNA encoding the protein and the use of DNA amplification methods 
using primers to introduce and amplify alterations in the DNA template. 

As understood by those skilled in the art, assay methods for 
identifying compounds, such as antagonists and agonists, that modulate 
functioning of a protein or pathway, generally require 
comparison to a control. One type of a "control" system is one that is 
treated substantially the same as the system, such as a worm, exposed 
to the test compound except that the control is not exposed to the test 
compound. Another type of a control may be one that is identical to the test 
system, except that it does not express the gene or protein of interest. In 
this situation, the response of the test system is compared to the response 
(or lack of response) of the control to the test compound, when each 
is exposed to substantially the same reaction conditions in the presence 
of the compound being assayed. 

As used herein, treatment means any manner in which the 
symptoms of a condition, disorder or disease are ameliorated or 
otherwise beneficially altered. 

As used herein, amelioration of the symptoms of a particular 
disorder by administration of a particular pharmaceutical composition 
refers to any lessening, whether permanent or temporary, lasting or 
transient that can be attributed to or associated with administration of the 
composition. 

As used herein, a composition refers to any mixture of two or more 
components. It may be a solution, suspension, or any other mixture. 

As used herein, biological activity refers to the in vivo activities of 
a compound or physiological responses that result upon in vivo 
administration of a compound, composition or other mixture. Biological 
activity, thus, encompasses therapeutic effects and pharmaceutical 
activity of such compounds, compositions and mixtures. 
Nematodes as disease models 

Nematodes serve as model organisms for the study of gene 
expression. Caenorhabditis elegans is representative of nematodes. It is 
a small, free-living, bacterivorous soil nematode that is a member of the 
Rhabditidae, a large and diverse group of nematodes found in terrestrial 
habitats. Some rhabditids are pathogenic to or parasitic on animals. In 
common with other nematodes, C. elegans develops through four larval 
stages (also called juveniles) that are separated by moults. The lifecycle 
takes about 3 days at 20°C. 

C. elegans is only 1 mm long and can be handled in a manner 
similar to microorganisms, including growth on petri plates seeded with 
bacteria. In the laboratory, C. elegans is fed on E. coli. It has a 
transparent body and all somatic cells (959 in the hermaphrodite; 1031 in 
the male) are visible with a microscope. 

Although it is a primitive organism, it shares many of the essential 
biological characteristics, including embryogenesis, morphogenesis, 
development and aging that are central problems of human biology. The 
worm is conceived as a single cell that undergoes a complex process of 
development, starting with embryonic cleavage, proceeding through 
morphogenesis and growth to the adult. It has a nervous system with a 
'brain' (the circumpharyngeal nerve ring). It exhibits definable behaviors, 
and is capable of rudimentary learning. It produces sperm and eggs, 
mates and reproduces. After reproduction it gradually ages, loses vigor 
and dies. Its average life span is 2-3 weeks. 

Adult C. elegans are usually self-fertilizing protandrous 
hermaphrodites. As a result homozygous mutant stocks can be readily 
generated. The hermaphrodite gonad first produces germ cells that 
differentiate as sperm (about 250 sperm are produced) and then produces 
eggs. The fecundity is determined by the sperm supply. 

Nematodes, particularly C. elegans, are among the most thoroughly 
understood of all multicellular organisms. The biology of its nervous 
system, which contains 302 neurons, is well-documented. Many C. 
elegans genes have counterparts in mammals, including humans. At 
least half of the C. elegans genes and proteins that have been 
characterized have structures and functions similar to mammalian genes. 
These include genes that encode enzymes, proteins necessary for cell 
structure, cell surface receptors and genetic regulatory molecules. 

Animals from man to worm have most of their protein families in 
common and humans frequently have four to five close analogs of a 
protein family member, where worms have only one. Essentially all 
genes and pathways shown to be important in cell-, developmental- and 
disease-biology have been found to be conserved between worm and 
human. This conservation applies to the number and type of protein 
families, gene structure, the hierarchy of genes in genetic pathways and 
even gene regulation. 

A consequence of this conservation is that human genes can be 
inserted into the worm genome, to functionally replace the worm genes 
even in complex cell biological and signal transduction pathways. 
Conversely, key worm genes identified using genetics can be used to 
trigger specific biochemical processes in human cells and to serve as 
models for the human genes. 
Genetics Nomenclature 

C. elegans is diploid and has five pairs of autosomal chromosomes 
(designated I, II, III, IV and V) and a pair of sex chromosomes (X) that 
determine gender. XX is a hermaphrodite and XO is male. Males are 
found rarely (about 0.05% of normal lab populations). The commonest 
lab strain, and the designated "wild-type" strain, is called N2. 

For historical reasons C. elegans nomenclature is different from 
other species. Loci have a three-letter, dash, number designation. The 
letters are an acronym for the phenotype and the number is consecutive. 
Alleles have a single or double letter followed by a number. The letter 
identifies the isolating laboratory. Strains have a letter(s) number 
designation. The letters identify the isolating laboratory (e.g., AB100 
abc-1(xy1000) is strain AB100, which carries the xy1000 allele of abc-1). 
The chromosomal location can be added: AB100 abc-1(xy1000) I. 
Multiple mutant alleles carried in one strain are organized by chromosome, 
and chromosomes are separated by semicolons. Heterozygous nematodes 
are designated by an abc-1/+ notation. Hence abc-1(+) indicates the 
wild-type (N2 strain) copy of the gene. Proteins are capitalised and not 
italicized. ABC is the protein product of abc-1. 
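
The designation scheme described above is regular enough to be parsed mechanically. The following is a minimal sketch, assuming the single-gene form "STRAIN gene(allele) CHROMOSOME" with the strain and chromosome optional; the class name, function name and regular expression are hypothetical and do not cover multi-gene genotypes, heterozygotes, or rearrangements.

```python
import re
from typing import NamedTuple, Optional


class Genotype(NamedTuple):
    strain: Optional[str]      # e.g. "AB100": laboratory letters plus number
    gene: str                  # e.g. "abc-1": three letters, dash, number
    allele: str                # e.g. "xy1000": one or two letters plus number
    chromosome: Optional[str]  # I, II, III, IV, V or X


_GENOTYPE = re.compile(
    r"^(?:(?P<strain>[A-Z]+\d+)\s+)?"
    r"(?P<gene>[a-z]{3}-\d+)"
    r"\((?P<allele>[a-z]{1,2}\d+)\)"
    r"(?:\s+(?P<chrom>III|II|IV|I|V|X))?$"
)


def parse_genotype(text: str) -> Genotype:
    """Parse a single-gene designation such as 'AB100 abc-1(xy1000) I'."""
    match = _GENOTYPE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized designation: {text!r}")
    return Genotype(match["strain"], match["gene"], match["allele"], match["chrom"])
```

Applied to the example in the text, `parse_genotype("AB100 abc-1(xy1000) I")` separates the strain AB100, the gene abc-1, the allele xy1000 and chromosome I.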

Rearrangements, duplications and deficiencies have a letter prefix 
(indicating the isolating lab), a Dp (pronounced dupe, for duplication) or Df 
(pronounced dif, for deficiency) and a number (e.g., xyDp1 is duplication 
number 1 from the xy lab and xyDf1 is deficiency number 1 from the xy lab). 
Transgenic strains carrying the transgene as a free extrachromosomal 
array are designated as follows: xyEx1[abc-1(+)] is a transgenic strain 
carrying the wt copy of abc-1. 

The C. elegans Genome 

The C. elegans genome, which is 97 Mb, contains six 
approximately equally sized chromosomes (5 autosomes, one X) and it 
has been sequenced (see, (1998) Science 282:2012-2018) and is publicly 
available. The 97 Mb encodes a predicted 19,099 genes, although, as 
shown herein, there remain ambiguities. Over 60,000 cDNA fragments have 
been tag sequenced and 101000 ESTs deposited. These "expressed 
sequence tags" or ESTs offer a set of snapshots of gene expression in the 
nematode, and have identified around half of the organism's genes. The 
cDNA data is used in the prediction of genes from the genome sequence 
along with database searches for similarities between C. elegans genes 
and those of other organisms such as humans. The gene estimate is based 
on the correspondence between genomic DNA sequence and cDNA 
sequences, and on the prediction of coding genes from genomic 
sequence. The genome data (and much else besides) is collated into an 
available database, ACeDB, written for the C. elegans project. A physical 
map of the genome, which is publicly available in the C. elegans 
genome database ACeDB, has been constructed. The map is based on 
17,000 cosmid clones of genomic DNA (insert size 35-40 kb). These 
clones were "fingerprinted" using restriction enzymes, and the 
fingerprints used to order the clones in overlapping contiguous sets, or 
contigs. These cosmid contigs have been supplemented by a set of 3,000 
yeast artificial chromosome clones (insert sizes 100 kb and above). 
Because the yeast host tolerates sequences that E. coli does not, the 
YAC clones can "bridge" gaps between contigs of cosmids. With these 
two resources, contigs covering >95% of all the chromosomes have 
been assembled. The clones are freely available for researchers, and the 
3,000 YAC clones are available as an array on a filtermat, arranged in 
approximate chromosomal order, for screening purposes. 
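
The redundancy implied by the clone numbers above can be estimated with a short calculation. The sketch below is a back-of-the-envelope illustration using only the figures stated in the text (97 Mb genome, 17,000 cosmids of 35-40 kb, 3,000 YACs of at least 100 kb); the function name is hypothetical, and the result is a nominal fold coverage that assumes clones are spread evenly, which real libraries are not.

```python
GENOME_MB = 97          # genome size stated in the text
COSMID_CLONES = 17_000  # cosmid clones in the physical map
YAC_CLONES = 3_000      # yeast artificial chromosome clones


def fold_coverage(n_clones: int, insert_kb: float, genome_mb: float = GENOME_MB) -> float:
    """Nominal fold coverage = total cloned DNA / genome size."""
    return (n_clones * insert_kb) / (genome_mb * 1000.0)


cosmid_low = fold_coverage(COSMID_CLONES, 35)   # about 6.1-fold
cosmid_high = fold_coverage(COSMID_CLONES, 40)  # about 7.0-fold
yac_minimum = fold_coverage(YAC_CLONES, 100)    # at least about 3.1-fold
```

On these figures the cosmid set alone represents roughly six- to seven-fold nominal coverage, which is consistent with contigs spanning most of each chromosome once YAC clones bridge the remaining gaps.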

The genomes of other nematodes are in the same size range. 
Brugia malayi, a filarial parasite of humans, has a genome of 100 Mb; 
Ascaris suum, the pig roundworm, has a larger germ line genome which 
undergoes somatic diminution. 

Identification of the genes associated with the location of vulva and 
response behaviors 

The behaviors 

The six sub-steps of the stereotyped copulatory sequence have been 
correlated with the function of individual neurons, and behavioral mutants 
have been isolated (Liu et al. (1995) Neuron 14:79-89). C. elegans male 
mating behavior includes a series of steps: response to contact with the 
hermaphrodite, backing along the body of the hermaphrodite, turning 
around her head or tail, location of the vulva, insertion of the two 
copulatory spicules into the vulva and sperm transfer. Sensory structures 
and neurons that participate in each of these steps have been identified: 
the sensory rays mediate response to contact and turning; the hook, the 
postcloacal sensilla and the spicules mediate vulva location; and the 
spicules also mediate spicule insertion and regulate sperm transfer. 

Thus, the stereotyped mating behavior of the Caenorhabditis 
elegans male comprises several substeps: response, backing, turning, 
vulva location, spicule insertion, and sperm transfer (Fig. 1). The 
complexity of male mating behavior is reflected in the sexually dimorphic 
anatomy and nervous systems of the male and hermaphrodite (Hodgkin, 
J. (1988) in The Nematode C. elegans (ed. Wood, B.) pp. 243-279 (Cold 
Spring Harbor Laboratory Press, New York)). Behavioral functions have 
been assigned to most male-specific sensory neurons via cell ablations 
(Liu et al. (1995) Neuron 14:79-89). Although the hermaphrodite is 
behaviorally passive, her vulva provides sensory cues to the male. 

Vulva location behavior is complex. The male stops and precisely 
positions his tail over the vulva, coordinates his movement to the 
hermaphrodite's, and ultimately inserts his spicules into the vulva slit and 
transfers sperm into the uterus. The hook sensory neurons, HOA and 
HOB, are specifically required for location of vulva (Lov) behavior. 
Ablation of either HOA or HOB results in a Lov defect whereby the 
ablated male circles the hermaphrodite without stopping at the vulva 
(Fig. 1). Eventually, the ablated male begins an alternative search by 
backing slowly and prodding randomly with his spicules until the vulva is 
located. The postcloacal sensilla are required for slow search behavior. 
Vulva location behavior is executed by a minimum of eight sensory 
neurons with overlapping and redundant functions (Liu et al. (1995) 
Neuron 14:79-89). 

A genetic analysis of vulva location behavior was performed to 
investigate how genes specify sensory behavior, beginning with sensory 
reception. The mating behavior of existing mutants defective in sensory 
behaviors including chemotaxis to soluble and volatile odorants, 
mechanosensation, and osmotic avoidance was first examined. From this 
survey, it was found that only males with severe defects in all sensory 
neuron cilia (osm-4, osm-5, osm-6, and che-3) were Lov defective 
(Table 2). For example, osm-6(p811) males locate the vulva with an 
efficiency of 32% versus 96% of wild-type (Table 2). These males are 
also response defective, but not so severely as to prevent observation of 
the Lov phenotype. The only ciliated cells in C. elegans are 
chemosensory and mechanosensory neurons (White et al. (1986) Philos. 
Trans. R. Soc. Lond. B Biol. Sci. 314:1-340). The male tail possesses 
thirty predicted ciliated sensory neurons (Sulston et al. (1980) Dev. Biol. 
78:542-576), consistent with the observation that ciliated neurons 
modulate response and Lov. osm-6::gfp is expressed exclusively in 
ciliated neurons, with male-specific expression in four CEM head neurons 
and neurons of the rays and copulatory spicules (Collet et al. (1998) 
Genetics 148:187-200). More detailed examination revealed that 
osm-6::gfp expression begins at the L4 stage in neuronal cell bodies and 
extends to dendrites as neuronal outgrowth proceeds (data not shown). 
The RnA and RnB neurons of each ray (ray 1 through ray 9), the HOA and 
HOB hook neurons, the spicule neurons SPV and SPD, and the PCB 
postcloacal sensilla neurons accumulate GFP. The osm-6 expression 
pattern and mutant phenotypes indicate that OSM-6 might be required for 
the structure and function of ciliated neurons in the adult male tail. In the 
hermaphrodite, osm-6 function is required for nose touch (Kaplan et al. 
(1993) Proc. Natl. Acad. Sci. U.S.A. 90:2227-2231), osmotic avoidance, 
chemotaxis, dye-filling of sensory neurons, thermotaxis, dauer formation, 
and proper assembly of ciliated sensory endings (Perkins et al. (1986) 
Dev. Biol. 117:456-487). Hence, ciliated endings are important for all 
known sensory behaviors, including Lov. 

TABLE 2. Vulva location behavior of wild-type and mutant males 

Genotype               Vulva location    Significantly different      n 
                       efficiency (%)    from wild-type (p value) 
him-5 (wild-type)      96                                             101 
osm-1(e1803)           65                No (0.0738) 
osm-4(p821)            48                Yes (0.0004) 
osm-5(p813); him-5     26                Yes (0.0002) 
osm-6(p811)            32                Yes (0.0003) 
che-3(e1124)           69                Yes (0.02666) 
lov-1(sy582Δ)          11                Yes (<0.0001) 
lov-1(sy552); him-5    30                Yes (<0.0001) 

Table 2. lov-1(sy552); him-5(e1490), lov-1(sy582Δ), and all cilia-defective mutants were 
also response defective. Males that eventually responded were scored for Lov behavior. 
n represents the number of males observed, each for a minimum of 10 vulva encounters 
per male. Mann-Whitney tests determined p values. The following non-cilia-defective 
osmotic avoidance (osm), mechanosensory defective (mec), chemosensory defective 
(che), odorant response abnormal (odr) and dauer formation defective (daf) mutants 
were also examined and found to be normal for response and Lov behavior: 
osm-3(e1806); him-5(e1490), osm-7(n1515), osm-8(n1518), osm-10(n1604), 
osm-11(n1604), osm-12(n1606), mec-3(e1338) him-8(e1489), mec-4(e1611), mec-5(e1340), 
mec-7(n434), mec-7(e1343), mec-8(e398), mec-9(e1494), che- 11 2, odr-1(n1936), 
odr-2(n2145), odr-3(n2150), odr-4(n2144ts), odr-5, odr-6(kyf), odr-7(ky4), odr-10(ky32) and 
daf-11(m47ts). 

Provided herein are mutants that are defective in location of the 
vulva (Lov). Lov mutant males are unable to execute this step. In 
addition, these males are also defective in the first sub-step, 'response'. 
Response and vulva location depend on two types of male sensory 
structure: the first is a set of nine pairs of rays, which project out of the 
tail on each side; and the second is a hardened cuticular structure called 
the hook, which contains two sensory neurons. These mutants were 
used to identify the genes involved in these behaviors. 
Identification and cloning of the lov-1 gene 

To elucidate the molecular basis of sensory behavior, the 
mutants are studied and genes associated with the behaviors are 
identified. A gene designated lov-1 that is required for two male sensory 
behaviors, response and location of vulva (Lov), is described herein. It is 
also associated with other sensory behaviors controlled by the CEM 
neurons. 

This gene, lov-1, encodes a putative membrane protein with a 
mucin-like, serine-threonine rich amino terminus (Carraway et al. (1995) 
Trends Glycoscience Glycotechnology 7:31-44) followed by two blocks of 
homology to human polycystins encoded by the autosomal dominant 
polycystic kidney disease (ADPKD) genes (Torres et al. (1998) Current 
Opinion in Nephrology and Hypertension 7:159-169). LOV-1 and human 
PKD1 are 26% identical in block 1. Block 2 also shows 20% identity 
between LOV-1, all identified polycystins (PKD1, PKD2, and PKDL), and 
the family of voltage-activated channels (Torres et al. (1998) Current 
Opinion in Nephrology and Hypertension 7:159-169). Overall, LOV-1 is 
the closest C. elegans homolog of PKD1. The polycystin/channel domain 
(block 2) of LOV-1 is required for function. lov-1 is specifically expressed 
in adult male sensory neurons of the rays, hook, and head, which mediate 
response, Lov, and potentially chemotaxis to hermaphrodites, respectively 
(Liu et al. (1995) Neuron 14:79-89; Ward et al. (1975) J. Comp. Neurol. 
160:313-337). Localization of lov-1 to neuronal cell bodies and ciliated 
sensory endings is consistent with a role in chemo- and/or 
mechanosensory reception and signaling. Human PKD proteins might 
similarly be involved in sensory reception during osmoregulation, 
organogenesis and/or organ maintenance. 
Cloned genes and encoded proteins 

To identify genes specifically required for male sensory behaviors, 
mutants defective in Lov were screened. lov-1(sy552) males have 
specific response and Lov defects. Upon encountering a hermaphrodite, a 
lov-1(+) male ceases forward motion, places his tail flush on the 
hermaphrodite, commences backing along her body, and turns at her ends 
until he encounters her vulva and stops. Mutant males defective in lov-1 
frequently do not respond to contact with the hermaphrodite and continue 
blindly moving forward. When response is initiated, lov-1 mutants back 
and turn normally but pass the vulva at a high frequency. The response 
and vulva location ability of lov-1(sy552) is 30% that of lov-1(+) males 
(Table 2). Spicule insertion and sperm transfer behaviors are unaffected. 
lov-1(sy552) males exhibit high mating efficiency with severely paralyzed 
unc-52 hermaphrodites but sire few progeny with actively moving dpy-17 
hermaphrodites. The difference in mating efficiency is 
partner-dependent. A paralyzed partner is an easier target for the lov-1 mutant 
male, who is defective in response and Lov but unimpaired in the 
behaviors of backing, turning, spicule insertion, and sperm transfer. The 
behavioral defects of sy552 are limited to male mating. lov-1(sy552) 
mutants appear normal for other sensory behaviors including egg laying, 
nose touch, tap, mechanosensation, and osmotic avoidance. 

The lov-1 gene was cloned by genetic mapping and transformation 
rescue of the sy552 behavioral defects (Fig. 2a). mnDf2//sy552, 
mnDf83/sy552 and sy552/sy552 males are phenotypically 
indistinguishable; therefore, sy552 is a reduction or loss of function 
mutation in lov-1. This conclusion is supported by the observed recessive 
nature of sy552. A 16.9 kb HindIII subclone (plov-1.1) of the cosmid ZK945 
rescued response and Lov defects of sy552 (Fig. 2a). Both a 6.7 kb 
HindIII-BamHI fragment from plov-1.1 (plov-1::GFP1) and a 14.1 kb 
HindIII-StuI frameshift in plov-1.1 (plov-1.3) fail to rescue sy552 defects 
(Fig. 2b) yet act in a dominant negative (DN) manner in wild-type males 
with respect to Lov behavior (Fig. 2c). Wild-type males expressing either 
plov-1::GFP or plov-1.3 are Lov defective. These transgenic males 
exhibit a wild-type response to hermaphrodite contact. Without being 
bound by a theory, the differences in sy552 and transgenic DN 
phenotypes might be attributed to dosage or mosaicism. 

Figure 2b illustrates the intron-exon boundaries of the lov-1 gene.
Using RT-PCR with lov-1 specific primers and him-5 mRNA, it was found
that lov-1 encodes one transcript corresponding to Genefinder-predicted
ORFs, ZK945.10 and ZK945.9 (Fig. 2b), which had been thought to be
two genes. lov-1 encodes a predicted 3178 amino acid membrane-bound
protein (see SEQ ID Nos. 3 and 4) with a serine-threonine rich
extracellular domain homologous to mucins (Carraway et al. (1995)
Trends Glycoscience Glycotechnology 7:31-44), a polycystin homology
block 1 (26% identity), and a carboxy terminal polycystin block 2 with
20% identity to polycystin proteins 1, 2, and L, encoded by the PKD1,
PKD2, and PKDL (polycystic kidney disease) genes, respectively (Fig. 2d).
A Kyte-Doolittle hydropathy plot predicts multiple transmembrane
domains, although no signal peptide is predicted in LOV-1. Mucins are
highly glycosylated extracellular proteins thought to serve cell adhesion
and/or protective functions (Carraway et al. (1995) Trends Glycoscience
Glycotechnology 7:31-44).

Similarity between exons W (for PKD1 only), X, Y, Z, AA, BB, and
CC of lov-1 and PKD1, PKD2, and the family of voltage-activated calcium
and potassium channels in the six transmembrane spanning region has
been observed (Mochizuki et al. (1996) Science 272:1339-1342). This
extends to PKDL (Nomura et al. (1998) J. Biol. Chem. 273:25967-25973).
LOV-1 lacks the Ca2+ binding EF-hand of polycystin 2 and L, and a
coiled-coil domain of all three polycystins (Fig. 2d), which has been
shown to mediate hetero- and homotypic interactions between
polycystin 1 and polycystin 2 (Qian et al. (1997) Nature Genetics
16:179-183; Tsiokas et al. (1997) Proc. Natl. Acad. Sci. USA
94:6965-6970). Block 2 also shows limited homology with the trp
(transient receptor potential) family of channels (Montell et al. (1989)
Neuron 2:1313-1323). The critical difference between voltage-gated and
trp channels is the presence of a positively charged S4 transmembrane
domain that acts as a voltage sensor (Montell et al. (1989) Neuron
2:1313-1323). LOV-1 more closely resembles voltage-gated channels in
this respect. A frameshift disruption in lov-1 (plov-1.3) one residue
away from a corresponding nonsense mutation in human PKD2 (Mochizuki
et al. (1996) Science 272:1339-1342) destroys the ability to rescue
lov-1(sy552), as mentioned above. The construct plov-1.3 encodes a
truncated protein lacking the polycystin block 2/channel domain. These
results demonstrate that the polycystin block 2/channel domain is
essential for LOV-1 function, and indicate that functional as well as
structural similarities might exist between LOV-1 and PKD-2. LOV-1 also
possesses a nucleotide-binding domain (Fig. 2d) that is not present in
the human polycystins. The structure of LOV-1 is also indicative of a
role in signal transduction.

The lov-1 gene product appears to be a membrane spanning protein
that includes an extracellular domain with a serine/threonine-rich
mucin-like domain, an ATP-binding domain, and small cytoplasmic tails
that mediate interaction with other members of the pathway, including a
pkd-2 gene product that is also a membrane spanning protein, with six
membrane domains and a cytoplasmic EF-hand. Interaction of these
proteins leads to the observed phenotypic response. In C. elegans this
response can be detected as a clearly identifiable phenotype. Hence,
C. elegans and mutants thereof can serve as a test system for
identifying compounds that alter this pathway and also for identifying
other gene products involved in the pathway.

lov-1 gene

In an exemplary embodiment, the complement of the nucleic acid
sequence of the lov-1 gene from C. elegans is provided. Corresponding
genes from other nematodes may be identified, such as by using the
nucleic acid provided herein to screen an appropriate genomic or cDNA
library using standard procedures. Alternatively, sequence databases
may be searched and the genes from other nematodes homologous to those
provided herein identified, again using standard searching and
alignment programs.

SEQ ID NO. 3 is the complement of the genomic sequence of the
lov-1 gene. It includes open reading frames (ORFs) between nucleotides
15760 to 27880 of cosmid ZK945 (nucleotides 1 to 12121 of SEQ ID
NO. 3) and nucleotides 1-564 of cosmid F27E5 (nucleotides 12122 to
12685 of SEQ ID NO. 3). It was found herein, however, that ZK945 and
F27E5 overlap from nucleotides 27881 to 27981 and nucleotides 1 to
101, respectively (the overlap region includes nucleotides 12122 to
12222 in SEQ ID NO. 3), thereby providing a single ORF rather than two
ORFs.
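
The coordinate relationships stated above reduce to two fixed offsets.
The following is a minimal sketch of that conversion, offered only as an
illustration: the function names are hypothetical, and the offsets are
derived solely from the nucleotide ranges given in the preceding
paragraph.

```python
# Offsets derived from the ranges stated in the text:
#   ZK945 nucleotide 15760 corresponds to SEQ ID NO. 3 nucleotide 1
#   F27E5 nucleotide 1     corresponds to SEQ ID NO. 3 nucleotide 12122
ZK945_OFFSET = 15760 - 1
F27E5_OFFSET = 12122 - 1


def zk945_to_seq3(pos):
    """Convert a ZK945 cosmid position to a SEQ ID NO. 3 position (hypothetical helper)."""
    if not 15760 <= pos <= 27981:
        raise ValueError("position lies outside the ZK945 region covered by SEQ ID NO. 3")
    return pos - ZK945_OFFSET


def f27e5_to_seq3(pos):
    """Convert an F27E5 cosmid position to a SEQ ID NO. 3 position (hypothetical helper)."""
    if not 1 <= pos <= 564:
        raise ValueError("position lies outside the F27E5 region covered by SEQ ID NO. 3")
    return pos + F27E5_OFFSET


if __name__ == "__main__":
    # The two cosmids describe the same SEQ ID NO. 3 positions across the overlap.
    print(zk945_to_seq3(27881), f27e5_to_seq3(1))
    print(zk945_to_seq3(27981), f27e5_to_seq3(101))
```

Both lines printed by the sketch show matching pairs, which is the sense
in which the two cosmid sequences join into one continuous sequence.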

It had been thought that the open reading frame in cosmid ZK945 (the
"ZK945.9" gene; nucleotides 1 to 9164 of SEQ ID NO. 3), and the open
reading frame in cosmid F27E5 (the "ZK945.10" gene; nucleotides 9415
to 12685 of SEQ ID NO. 3) encoded two genes. DNA sequence analysis
of RT-PCR generated cDNA clones from him-5(e1490) RNA revealed three
exons (exons I, J and K in Figure 2B) in the junction between ZK945.10
and ZK945.9: one from nucleotides 25195 to 25742 of the ZK945
cosmid (nucleotides 9436 to 9983 of SEQ ID NO. 3); a second from
nucleotides 25071 to 25151 of the ZK945 cosmid (nucleotides 9312 to
9392 of SEQ ID NO. 3); and a third initiating at position 25021 in the
ZK945 cosmid (nucleotide 9262 of SEQ ID NO. 3). This demonstrated
that the lov-1 gene encodes one large transcript corresponding to ORFs in
ZK945.10 and ZK945.9, spanning what had previously been thought to
encode two proteins.

As noted above, Figure 2B depicts the lov-1 genomic structure
(exons shown as boxes, introns as lines). With reference to Figure 2B,
the coding sequence in the gene set forth in SEQ ID No. 3 (noting that
SEQ ID No. 3 sets forth the non-coding strand) is as follows:

Complement (Join (12500..12685) - Exon A; (12266..12451) - Exon B;
(12085..12217) - Exon C; (11683..11823) - Exon D; (11498..11637) -
Exon E; (11128..11452) - Exon F; (10268..10899) - Exon G;
(10138..10216) - Exon H; (9436..9983) - Exon I; (9312..9392) - Exon J;
(8685..9262) - Exon K; (8557..8635) - Exon L; (7830..7997) - Exon M;
(6774..7786) - Exon N; (6648..6728) - Exon O; (6305..6598) - Exon P;
(6006..6255) - Exon Q; (5732..5958) - Exon R; (4849..5076) - Exon S;
(4698..4799) - Exon T; (4383..4651) - Exon U; (3336..4328) - Exon V;
(2229..3094) - Exon W; (1976..2181) - Exon X; (1635..1930) - Exon Y;
(1043..1591) - Exon Z; (625..999) - Exon AA; (329..572) - Exon BB;
(1..270) - Exon CC).

The LOV-1 amino acid sequence is set forth in SEQ ID NO. 4. The
following table summarizes the above.

TABLE 3
Comparison of SEQ ID NO. 3 with source cosmids†

EXON    SEQ ID 3        ZK945           F27E5
A       12500..12685                    379..564
B       12266..12451                    145..330
C       12085..12217    27844..27976
D       11683..11823    27442..27582
E       11498..11637    27257..27396
F       11128..11452    26887..27211
G       10268..10899    26027..26658
H       10138..10216    25897..25975
*I      9436..9983      25195..25742
*J      9312..9392      25071..25151
*K      8685..9262      24444..25021
L       8557..8635      24316..24394
M       7830..7997      23589..23756
N       6774..7786      22533..23545
O       6648..6728      22407..22487
P       6305..6598      22064..22357
Q       6006..6255      21765..22014
R       5732..5958      21491..21717
S       4849..5076      20608..20835
T       4698..4799      20457..20558
U       4383..4651      20142..20410
V       3336..4328      19095..20087
**W     2229..3094      17988..18853
X       1976..2181      17735..17940
Y       1635..1930      17394..17689
Z       1043..1591      16802..17350
AA      625..999        16384..16758
BB      329..572        16088..16331
CC      1..270          15760..16029





analysis, and not predicted by the GeneFinder program) 

**The sy582 lov-1 mutant has a 1059 bp deletion beginning in exon W at
position 2267 of SEQ ID NO. 3 (18026 of the ZK945 cosmid) and ending at
position 1209 of SEQ ID NO. 3 (16968 of the ZK945 cosmid).

†The GenBank accession numbers for ZK945 and F27E5 are Z48544 and
Z48582, respectively.

Exemplary knockout mutant sy582

A genomic deletion of lov-1 was isolated in a PCR screen of EMS
mutagenized worms. lov-1(sy582Δ) encodes a truncated protein lacking
the polycystin/cation channel homology domain (Fig. 2d). Like sy552,
lov-1(sy582Δ) males exhibit defects in response and Lov behaviors
(Table 2), as well as low mating efficiency with dpy-17 but not unc-52
partners. sy582Δ is recessive and fails to complement sy552. The
truncated protein produced by lov-1(sy582Δ) does not act as a dominant
negative, in contrast to the truncated protein produced by plov-1.3 (see
below). This difference might be due to a dosage effect of the plov-1.3
transgene. These results confirm that the polycystin block 2/cation
channel domain is essential for LOV-1 activity and indicate that
lov-1(sy582Δ) is completely defective in LOV-1 function.

The lov-1(sy582) mutant is a 1059 bp deletion of nucleotides
18026 to 16968 of ZK945 (nucleotides 2267 to 1209 of SEQ ID NO. 3).
The deletion, which begins in exon W, removes the majority of the PKD
homology block 2 (a total of 308 amino acids, beginning at amino acid
2520 and ending at amino acid 2827 of the sequence set forth in SEQ ID
NO. 4) and continues to read in-frame to the end of the sequence set
forth in SEQ ID NO. 4. This results in a protein of 2870 amino acids with
the amino acid sequence set forth in SEQ ID NO. 15.
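
The arithmetic behind the in-frame deletion can be checked against the
exon coordinates of Table 3. The sketch below is an illustration only,
assuming that the coding sequence removed is exactly the portion of each
exon falling inside the deleted interval; the variable and function names
are hypothetical.

```python
# Exon boundaries near the deletion, in SEQ ID NO. 3 coordinates (from Table 3).
EXONS = {
    "V": (3336, 4328),
    "W": (2229, 3094),
    "X": (1976, 2181),
    "Y": (1635, 1930),
    "Z": (1043, 1591),
    "AA": (625, 999),
}
DELETION = (1209, 2267)   # sy582 deletion, SEQ ID NO. 3 coordinates
FULL_LENGTH_AA = 3178     # predicted LOV-1 length stated in the text


def overlap(span_a, span_b):
    """Number of positions shared by two inclusive (start, end) spans."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    return max(0, end - start + 1)


deletion_length = DELETION[1] - DELETION[0] + 1
coding_removed = {name: overlap(span, DELETION) for name, span in EXONS.items()}
total_coding_removed = sum(coding_removed.values())
codons_removed, remainder = divmod(total_coding_removed, 3)
truncated_length_aa = FULL_LENGTH_AA - codons_removed

if __name__ == "__main__":
    print("deletion length (bp):", deletion_length)
    print("coding nucleotides removed:", total_coding_removed)
    print("codons removed:", codons_removed, "remainder:", remainder)
    print("predicted product length (aa):", truncated_length_aa)
```

With these coordinates the sketch yields 924 coding nucleotides removed
(308 codons, remainder 0), consistent with the in-frame loss of 308 amino
acids and the 2870 amino acid product described above.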

Other mutants may be prepared by any method known to those of
skill in the art, including directed mutagenesis of the gene in a selected
nematode or random mutagenesis and selection for altered male mating
behavior in the Lov and/or response behaviors, preferably both.
Preferred regions for deletion include exon A. The precise size of the
deletion and/or the locations to delete can be determined empirically
using standard routine methods based upon the disclosure herein, which
identifies the gene and the resulting phenotype. Other mutations,
including insertions and point mutations that alter these behaviors, are
also contemplated and can be readily prepared.



Expression patterns of lov-1

To elucidate the cells in which lov-1 acts to affect male mating
behaviors, the expression pattern of lov-1::GFP reporter genes was
examined (see Example 2 and Fig. 4). These experiments reveal
regulatory regions in the lov-1 gene. A partial translational fusion
containing 2.8 kb of upstream sequence and 3.9 kb of lov-1
(plov-1::GFP1) directs expression in male-specific sensory neurons
(Fig. 2c and Fig. 4). Conversely, shorter versions of plov-1::GFP1 are
not expressed in the same set of male-specific neurons nor exclusively
in male-specific sensory neurons and do not act as DNs (Fig. 2c).
Similar results were observed with pkd-2 mutants (see Example 2 and
Fig. 4).

Nematode pkd-2

A search for a homolog of LOV-1 was performed to ascertain
whether nematodes possess a PKD2 ortholog. A BLAST search of the
Sanger Center C. elegans genome database revealed a possible LOV-1
homolog, Y73F8A.B. It encodes a protein with 27% identity to PKD2 and
possesses the coiled-coil domain of all polycystins. It is shown herein
that Y73F8A.B and Y73F8A.A encode one transcript that is the C. elegans
ortholog of human PKD2 (Fig. 2d and Fig. 3). The resulting nematode
gene, designated pkd-2, its cDNA and the encoded protein are provided
herein.

The C. elegans gene is exemplified herein. SEQ ID No. 5, which
sets forth the complement of the coding strand, is provided. It contains
nucleotides 1605 to 9677 of C. elegans cosmid Y73F8A (GenBank
Accession No. AL132862), which correspond to nucleotides 1 to 8073 of
SEQ ID No. 5. The sequence of the encoded protein is set forth in SEQ
ID No. 6. The cDNA yk219e1 was sequenced and corresponds to the 3' end
of pkd-2.

Figure 3B shows the pkd-2 genomic structure (exons shown as
boxes, introns as lines). The coding sequence in the gene set forth in
SEQ ID No. 5 is produced as follows:

Complement (Join (7980..8073) - Exon 1; (7396..7585) - Exon 2;
(6765..7045) - Exon 3; (5153..5283) - Exon 4; (4863..5104) - Exon 5;
(3931..4158) - Exon 6; (2875..3424) - Exon 7; (1957..2208) - Exon 8;
(1542..1795) - Exon 9; (367..505) - Exon 10; (1..87) - Exon 11).

As discussed above, the architecture of LOV-1, including a large
extracellular amino terminus, Block 1, and Block 2, is similar to that of
human PKD1; the architecture and sequence of PKD-2 are similar to those
of PKD2. Taken together, LOV-1 and PKD-2 appear to be part of a
multi-component complex and pathway. Further genetic analysis of Lov
behavior confirms this.

Knockout mutation of pkd-2

A knockout mutation can be prepared by any method known to
those of skill in the art. A deletion mutant, designated sy606, was
produced (see Examples for primers used). A 2397 bp deletion from
nucleotides 8338 to 5942, starting in intron 3 and ending in intron 5,
removes exons 4 and 5 (including the partial transmembrane spanning
domain S1 and the polycystin motif); the new splice is in a different
reading frame, resulting in a stop codon (TGA) at 5736, and thus
produces a knockout mutation. The resulting phenotype was the same as
that resulting from a knockout of lov-1, thereby demonstrating that the
two proteins are part of the same pathway that results in the observed
phenotype.

The pkd-2(sy606) mutant contains a 2397 bp deletion of
nucleotides 8338 to 5942 of Y73F8A (nucleotides 6734 to 4338 of SEQ
ID NO. 5), starting in intron 3 and ending in intron 5, removing exons 4
and 5 (including the partial transmembrane spanning domain S1 and the
polycystin motif) with the new splice in a different reading frame. This
results in a stop codon (TGA) at nucleotide 5728 (nucleotide 4124 in SEQ
ID NO. 5). The sequence of the protein encoded by the pkd-2 deletion
mutant (sy606) is set forth in SEQ ID NO. 16.
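
The statement that the new splice falls in a different reading frame can
be checked from the exon coordinates given for SEQ ID NO. 5. The sketch
below is illustrative only; its names are hypothetical, and it assumes
that the remaining exons are joined directly once exons 4 and 5 are lost.

```python
# Y73F8A nucleotide 1605 corresponds to SEQ ID NO. 5 nucleotide 1 (from the text).
Y73F8A_OFFSET = 1605 - 1

# Exons flanking the deletion, in SEQ ID NO. 5 coordinates.
EXONS = {3: (6765, 7045), 4: (5153, 5283), 5: (4863, 5104), 6: (3931, 4158)}

# sy606 deletion in Y73F8A cosmid coordinates.
DELETION_COSMID = (5942, 8338)


def to_seq5(cosmid_pos):
    """Convert a Y73F8A position to a SEQ ID NO. 5 position (hypothetical helper)."""
    return cosmid_pos - Y73F8A_OFFSET


def fully_deleted(exon, deletion):
    """True when an inclusive exon span lies entirely inside the deletion span."""
    return deletion[0] <= exon[0] and exon[1] <= deletion[1]


deletion = (to_seq5(DELETION_COSMID[0]), to_seq5(DELETION_COSMID[1]))
deletion_length = deletion[1] - deletion[0] + 1
removed_exons = [n for n, span in EXONS.items() if fully_deleted(span, deletion)]
removed_coding = sum(EXONS[n][1] - EXONS[n][0] + 1 for n in removed_exons)
frame_offset = removed_coding % 3   # non-zero means the downstream frame shifts

if __name__ == "__main__":
    print("deletion in SEQ ID NO. 5 coordinates:", deletion)
    print("exons removed:", removed_exons)
    print("coding nucleotides removed:", removed_coding, "frame offset:", frame_offset)
```

Exons 4 and 5 together contribute 373 coding nucleotides, which is not a
multiple of three, so the sequence downstream of the new junction is read
in a shifted frame, consistent with the premature stop codon described
above.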

TABLE 4
Comparison of SEQ ID NO. 5 with source cosmid

EXON    SEQ ID 5        Y73F8A
1       7980..8073      9584..9677
2       7396..7585      9000..9189
3       6765..7045      8369..8649
4       5153..5283      6757..6887
5       4863..5104      6467..6708
6       3931..4158      5535..5762
7       2875..3424      4479..5028
8       1957..2208      3561..3812
9       1542..1795      3146..3399
10      367..505        1971..2109
11      1..87           1605..1691



**The sy606 pkd-2 mutant has a 2397 bp deletion of nucleotides 8338 to
5942 of Y73F8A (GenBank Accession No. AL132862; nucleotides 6734 to 4338
of SEQ ID NO. 5), starting in intron 3 and ending in intron 5, removing
exons 4 and 5, with the new splice being in a different reading frame and
resulting in a stop codon (TGA) at nucleotide 5728 (4124 in SEQ ID NO. 5).

Other such deletions may be similarly produced by deleting any
portion that eliminates at least one of the observed phenotypic behaviors
associated with the lov-1 and pkd-2 pathway. Preferable targets for
these deletions are those that destroy the reading frame, resulting in
non-functional truncated proteins; deletions that eliminate
transcriptional or translational control regions; and deletions in the
first exon or another exon, such that the deletion (or insertion or point
mutation) eliminates or substantially attenuates activity of the encoded
protein as evidenced by altered phenotype.

The lov-1 and pkd-2 genes encode homologs of the polycystins

It is shown herein that the lov-1 and pkd-2 genes and gene
products are homologs of mammalian polycystins, particularly PKD1 and
PKD2, respectively. As such, nematodes that express these genes and/or
mutants of the genes can serve as models to study the expression and
function of these genes, to identify additional genes in the pathway,
and to screen for compounds that will serve as lead compounds for
treatment of PKD in mammals, particularly humans.

Neither the precise functions of the polycystins nor the molecular
basis of kidney cystogenesis is known. The results provided herein show
that the homologs of the polycystins act together in a pathway, which
appears to be a signal transduction pathway, in sensory neurons. It has
been postulated that human polycystin 1 and polycystin 2 function as an
ion channel (Torres et al. (1998) Current Opinion in Nephrology and
Hypertension 7:159-169). Further supporting this conclusion are the
results of others that have indicated that human PKD2 is associated with
the activity of a cation channel. These results were obtained using
cell-expression and electrophysiological approaches to examine the
potential channel function of a protein called PCL (polycystin-like)
that had been identified in the human expressed sequence-tag database by
its sequence similarity with PKD2 (Chen et al. (1999) Nature
401:383-386). PCL was expressed in Xenopus oocytes by microinjecting
synthetic mRNA, and the channel properties were studied using the
two-microelectrode voltage clamp and patch-clamp techniques. It was
found that PCL is a non-selective cation channel that is permeable to
sodium, potassium and calcium. It is more permeable to calcium. Thus,
PCL and PKD2 may be cation-channel subunits.

Hence, as shown herein, PKD1-related proteins act as receptors
that regulate the activity of PKD2-related proteins. The two proteins are
part of a conserved pathway that appears to be a signalling mechanism in
which the translocation of ions acts as a second messenger.

Exemplary strains

Strains that exhibit one or more of the behaviors are provided. The
strains may be prepared by mutagenizing wild-type or other strains with
other desirable characteristics and selecting for those with the
behavioral phenotype.

Strain PS3152 is an N2 strain with a deletion in lov-1
(lov-1(sy582)).

Strain PS2816 has the lov-1(sy552) deletion in a background with
him-5 (high incidence of males) and plg-1, a mutation that causes the
male to deposit a gelatinous mating plug (which can be used to visualize
mating).

Strain PS2817 is a paralyzed (unc-52) version of PS2816.

Strain PS3150 has the same deletion in a background with him-5 (high
incidence of males) and a ts lethal marker (pha-1). A strain with a ts
marker is a good recipient for transformation; pha-1 is the marker here,
but any marker can be used.

Strain PS3151 is the same as PS2815 without the plg-1 mutation.

Strain PS3149 has a pha-1 marker in a him-5 background and is
transformed with an extrachromosomal element containing a lov-1::GFP1
construct and pha-1(+) DNA.

Another strain is a him-5 strain with the lov-1(sy582) deletion.

Strain PS3400 has a deletion mutation in pkd-2; it is pkd-2(sy606).

Strain PS3401 is a him-5 strain with the lov-1(sy582) deletion.

Strain PS3377 is pkd-2(sy606) in a him-5 background.

These and other strains may be used in the assay methods
described herein or in any assay that assesses the pathways and sensory
functions in which lov-1 and/or pkd-2 are involved or that can be used
for identifying compounds that affect this pathway(s).

Assays for screening compounds and for identifying mutants with
observable Lov and/or response defective behavior

Assays are provided for identifying additional genes in the pathway,
for assessing the activities of proteins in the pathway, for identifying
regulators of gene expression and factors involved in expression of
genes in this pathway, and for screening for compounds that affect
polycystin function. Compounds that affect polycystin function in a
nematode are candidates for further investigation and serve as leads for
compounds that may be therapeutically useful for treating mammalian
PKDs.

Identification of components of the PKD pathway will aid in
understanding the etiology of the disease and permit identification of
disease markers and defective genes, thereby permitting development of
reagents for diagnostic tests and identification of therapeutic targets
and therapeutic agents.

The assays may be adapted for high throughput methods,
particularly by using multiwell plates, such as 24, 96, 384 wells or higher
densities, and automating many of the steps. By using multiple wells, for
example, many compounds can be screened. Scoring of the results can be
automated by using video or other recording means to record the behavior
in each well. Viewing using such means is facilitated by visually labeling
the animals, such as by introduction of reporter gene constructs that will
be expressed in areas of interest, such as the vulval and tail region of the
hermaphrodite, to render the animal visible to a camera. If a GFP is used,
for example, the camera will be equipped with an appropriate filter to
screen out all but the green glow. Other ways of making the animals
visible include, for example, use of plg-1 animals, which leave a visible
gelatinous trail as they move through the agar.
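
For bookkeeping in a multiwell format, each test compound can be tied to
a well label so that recorded behavior is traceable to the compound
applied. The sketch below is a hypothetical illustration, assuming the
conventional row-letter/column-number labeling of 24-, 96- and 384-well
plates and row-major filling; none of the names come from the disclosure.

```python
import string

# Conventional plate geometries as (rows, columns).
PLATE_SHAPES = {24: (4, 6), 96: (8, 12), 384: (16, 24)}


def well_label(index, n_wells):
    """Return the label (e.g. 'A1') of the zero-based, row-major well index."""
    rows, cols = PLATE_SHAPES[n_wells]
    if not 0 <= index < rows * cols:
        raise IndexError("well index out of range for this plate")
    row, col = divmod(index, cols)
    return f"{string.ascii_uppercase[row]}{col + 1}"


def assign_compounds(compounds, n_wells):
    """Map well labels to compounds, filling the plate in row-major order."""
    if len(compounds) > n_wells:
        raise ValueError("more compounds than wells")
    return {well_label(i, n_wells): name for i, name in enumerate(compounds)}


if __name__ == "__main__":
    # Compound names here are placeholders, not compounds named in the text.
    layout = assign_compounds(["untreated control", "compound 1", "compound 2"], 96)
    for well, name in layout.items():
        print(well, name)
```
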
Precise protocols for culturing nematodes, producing mutants
and transgenics, and for observing behaviors are well known to those of
skill in the art.

Assays using wild-type males

Behavioral screens

In these assays males will be identified that exhibit abnormal
behavior, particularly abnormal Lov and/or response behaviors, thereby
detecting components of PKD function, signaling or regulators, or
identifying compounds that are candidates for affecting PKD function,
signaling or regulation. A behavioral assay is depicted in Fig. 1, and
described herein.

The tests are performed by placing male nematodes on an agar
surface, such as a petri dish or microtiter plate with an agar surface, that
is seeded with anything, including bacteria or chemoattractants, such as
NaCl, that will keep the males in a field of view. One or more mating
partners, such as hermaphrodites, are placed on the plate and the behavior
is recorded, such as by direct observation, review of a video tape, or any
method whereby the behavior can be recorded.

For example, the behaviors can be observed using young adult
hermaphrodites, such as unc-31(e169) hermaphrodites, on a lawn of
bacteria, such as E. coli. The use of unc-31 hermaphrodites, which are
sluggish, makes it easier for males to keep pace with them.

For drug screening assays, the effects of a test compound are
examined. The males are treated with a compound, such as by culturing
them in the presence of the compound, including the compound in the
mating dish, or pretreating the males with the compound. For analysis of
mutants, males from parents or grandparents that had been
mutagenized with chemicals and/or radiation are tested.

In either embodiment, the behavior of the males is observed by
looking for one or both, preferably both, of the Lov and 'response'
behaviors compared to controls: untreated males for the drug screening
assays or wild-type males for the mutant assays. If the behavior of the
treated males differs from that of controls, then the compound has some
activity and is selected for further analysis.
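
The text does not prescribe how a difference from controls is to be
judged. One possible approach, shown here purely as an assumption, is a
one-sided Fisher exact test on counts of males that did or did not locate
the vulva; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_one_sided(treated_success, treated_fail, control_success, control_fail):
    """Probability, under no treatment effect, of observing this few or fewer
    successes among treated males, given fixed row and column totals."""
    n_treated = treated_success + treated_fail
    n_control = control_success + control_fail
    n_success = treated_success + control_success
    denominator = comb(n_treated + n_control, n_success)
    k_min = max(0, n_success - n_control)  # smallest feasible treated-success count
    numerator = sum(
        comb(n_treated, k) * comb(n_control, n_success - k)
        for k in range(k_min, treated_success + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative counts only: 4 of 20 treated males versus 17 of 20 controls.
    p_value = fisher_one_sided(4, 16, 17, 3)
    print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value would suggest that the treated males locate the vulva
less often than controls; any threshold for selecting a compound is a
design choice left to the investigator.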

For the assays of mutants, if the behavior of the males differs from
the controls, the mutation(s) are identified, such as by mapping. The
mutant gene is then identified and genetically analyzed, and its role in
the pathway is elucidated.

These methods as well as the others provided herein can be
adapted for high throughput analysis, including automation, such as by
videotaping and image processing. For image processing the animals can
be visually labeled, such as by expressing a reporter gene, like GFP, to
produce a stable transgenic strain carrying a GFP construct with any
promoter that would direct expression with sufficient intensity or in a
sufficient number of cells to visualize the behavior. For example, a
glowing vulva and tail would permit visualization of the Lov and response
behaviors. Suitable genes for linkage to a reporter are any that are
expressed in the animal to permit such visualization. Such markers
include, but are not limited to, autofluorescence of the male spicule,
egl-5-gfp, and, for the hermaphrodite vulval region, lin-11-gfp.

Measurements can be performed by any method known to those of
skill in the art (see, e.g., Liu et al. (1995) Neuron 14:79-89). Briefly,
measurements can be obtained as follows: time is kept with a
stopwatch or key stroke recorder on a computer to record an 'ethogram',
and distances are estimated by eye and confirmed from micrographs taken
of the behavior. Mating behavior is sensitive to a number of variables,
including the moisture level of the plates, which are not used if they are
more than a week old, and hermaphrodite age. Hence controls and test
animals are carefully matched. At least three hermaphrodites are used
per male to control for hermaphrodite specific behaviors.
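
A key stroke record of this kind can be reduced to time spent in each
behavior. The sketch below is a hypothetical illustration: it assumes
that each key stroke marks the onset of a behavior and that the behavior
lasts until the next key stroke; the labels and timings are invented for
the example.

```python
def ethogram_durations(events, end_time):
    """Sum the time spent in each behavior.

    events   -- chronological list of (onset_seconds, behavior_label)
    end_time -- time at which observation stopped, in seconds
    """
    totals = {}
    onsets = [t for t, _ in events] + [end_time]
    for (start, label), stop in zip(events, onsets[1:]):
        if stop < start:
            raise ValueError("events must be in chronological order")
        totals[label] = totals.get(label, 0.0) + (stop - start)
    return totals


if __name__ == "__main__":
    # Invented example record for one male.
    record = [
        (0.0, "searching"),
        (12.5, "response"),
        (14.0, "backing"),
        (20.0, "turning"),
        (23.0, "backing"),
        (30.0, "vulva located"),
    ]
    for behavior, seconds in ethogram_durations(record, end_time=35.0).items():
        print(f"{behavior}: {seconds:.1f} s")
```
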
Mating efficiency assays

As noted above, deletion of lov-1 compromises but does not
abolish the ability to mate. The mutant male can mate with paralyzed or
movement-impaired partners. To perform these assays, wild-type males are
treated with a test compound or mutagenized, and males that sire fewer
cross-progeny compared to wild-type or cannot sire cross-progeny with
moving partners are identified.

To detect whether the progeny are those of the males rather than
the hermaphrodites, sperm defective hermaphrodites can be used.
Preferably the hermaphrodites are temperature-sensitive (ts) sperm
defective. Alternatively, the mating can be detected by using
a visual marker, such as short and fat (Dpy; Dumpy)
hermaphrodites, or males that express a visually or otherwise detectable
transgene, such as fluorescent proteins (FPs), including, but not limited
to, blue fluorescent proteins and green fluorescent proteins (GFPs), and
looking in the progeny for the transgene, which is transferred into the
progeny by the mating. If a FP is used as a marker, glowing offspring
are detected.

Progeny can also be detected by measuring the density of the
culture resulting from a mating with a ts sperm defective hermaphrodite.
If there are many progeny, it can be inferred that the males have mated,
since the hermaphrodite is sperm defective.
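
Progeny counts from such an assay can be summarized as a mating
efficiency. The sketch below is one hypothetical way to do so, assuming
a recessive Dpy marker in the hermaphrodite so that non-Dpy offspring are
counted as cross-progeny and Dpy offspring as self-progeny; the function
names and counts are illustrative and not taken from the disclosure.

```python
def mating_efficiency(cross_progeny, self_progeny):
    """Fraction of scored progeny that were sired by the male."""
    total = cross_progeny + self_progeny
    if total == 0:
        raise ValueError("no progeny were scored")
    return cross_progeny / total


def relative_efficiency(test_counts, control_counts):
    """Efficiency of test males expressed relative to control males.

    Each argument is a (cross_progeny, self_progeny) tuple.
    """
    control = mating_efficiency(*control_counts)
    if control == 0:
        raise ValueError("control males sired no cross-progeny")
    return mating_efficiency(*test_counts) / control


if __name__ == "__main__":
    # Invented counts: (non-Dpy cross-progeny, Dpy self-progeny).
    treated = (10, 90)
    untreated = (50, 50)
    print("treated efficiency:", mating_efficiency(*treated))
    print("relative to control:", relative_efficiency(treated, untreated))
```
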
Assays using mutant males

Suppressor and enhancer genetics can be used to assign functions
to genes, to assign genes to pathways, to identify the key switches in
these pathways and to provide a sensitive assay to identify new genes in
a pathway and lead compounds that modulate the activity of genes
and/or gene products in the pathway.

Suppressor screen

In these assays, the process starts with a lov-1 mutant and restoration
of one or both behaviors is assessed, thereby identifying compounds or
mutations that correct the defect. Restoration can occur, for example,
by by-passing the defective gene, such as by constitutive expression of
a gene further down the pathway that had previously required lov-1 or
pkd-2 activity. Alternatively, a mutation could knock out the activity
of another gene that suppresses the activity of lov-1 or pkd-2, thereby
restoring the pathway. These assays will identify other genes in the
pathway. These assays can also identify a compound that corrects a
defect in the pathway, thereby providing a promising therapeutic lead
for treatment of APKD.

Enhancer screen

In these assays, mutations or compounds are sought that exacerbate the
defect, that is, that increase the penetrance of the phenotype caused by
the lov-1 or pkd-2 mutations for either or both of the 'response' and
Lov defects. This is achieved by screening for males that cannot sire
cross progeny with paralyzed hermaphrodite mating partners or by
observing the behavior directly. The genes bearing the mutations
responsible for the increased penetrance are identified, and those that
are not lov-1 or pkd-2 are selected. Mammalian, particularly human,
homologs of the selected genes are identified and tested to assess their
role in PKDs, such as, for example, by screening PKD patients for
alterations in the homologous (or orthologous) gene, analysis of mouse
model knockout mutations, or other methods known to those of skill in
the art.

Assays for identifying the role of PKD proteins in sensory function

As shown herein, lov-1 and pkd-2 are expressed in CEM neurons,
indicating that they have activity in other sensory functions, such as
finding a mating partner at a distance, i.e. sexual chemotaxis or kinesis,
where the male randomly finds a hermaphrodite and then stays nearby.
Hence sexual or chemoattraction assays can be used to study PKD
function. To perform this assay, for example, males that are
mutagenized or treated with a test compound are placed on a surface
containing, at particular locations, hermaphrodites and a control (e.g.,
males, other hermaphrodites, or buffer). The proportion of males that
choose the hermaphrodites compared to the control is scored. If the male
is defective in this sensory function, it will not distinguish between
males and hermaphrodites.

Other sensory functions can be assessed to identify the role, if any,
of PKD genes in the functions.
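
Scoring the proportion of males that choose the hermaphrodites can be
sketched as below. This is an illustrative assumption about how the
counts might be tallied, with males that reach neither location left
unscored; the function name and the counts are hypothetical.

```python
def choice_fraction(n_at_hermaphrodites, n_at_control):
    """Fraction of scored males found at the hermaphrodite location.

    A value near 0.5 suggests no discrimination between the two locations;
    a value near 1.0 suggests a preference for hermaphrodites.
    """
    scored = n_at_hermaphrodites + n_at_control
    if scored == 0:
        raise ValueError("no males reached either location")
    return n_at_hermaphrodites / scored


if __name__ == "__main__":
    # Invented counts for untreated and treated (or mutant) males.
    print("untreated:", choice_fraction(18, 2))
    print("treated:  ", choice_fraction(10, 10))
```
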

Assays that use dominant negative forms of PKD in nematodes or
in other cells to identify mutations and/or compounds that inhibit or
otherwise alter PKD function

Transgenic nematodes that express a version of the LOV-1 or PKD-2
protein that inhibits the activity of LOV-1 and/or PKD-2, as assessed by
manifestation of the altered Lov and/or response phenotypic behavior(s),
are used in these assays.

As described above, a dominant negative mutation is a mutation
that encodes a polypeptide that when expressed disrupts the activity of
the protein encoded by the wild-type gene (see Herskowitz (1987)
Nature 329:219-222). A cloned gene is altered so that it encodes a
mutant product that, upon expression in an organism or cell containing
the wild-type gene, inhibits or eliminates the activity of the wild-type
product. As a result, the cell or organism is deficient in the product.
The mutation is "dominant" because its phenotype is manifested in the
presence of the wild-type gene, and it is "negative" in the sense that it
inactivates the wild-type gene function. It is possible to do this because
proteins have multiple functional sites. Hence an assay that identifies a
dominant negative mutation can identify functional activities of a protein.

In this instance, the assays use transgenic nematodes that contain
such a dominant negative lov-1 or pkd-2 transgene. In certain assays,
the transgenic mutants are mutagenized, and mutants that lose a
remaining activity are selected. The mutations and genes responsible for
the loss are identified. Corresponding mammalian, particularly human,
genes are then identified, such as by searching databases for homologs or
by probing libraries with the nematode genes.

In the compound screening assays that employ these transgenic
nematodes, compounds that interfere with a remaining activity of the
lov-1 or pkd-2 gene are identified. For example, as shown herein, plov-1.3
(plov-1.3 encodes a truncated protein lacking the polycystin block
2/channel domain) has a dominant negative effect in transgenic
nematodes affecting only the Lov behavior, not Response. Compounds
that rescue this dominant negative effect include those that interfere with
the synthesis, binding or function of the amino-terminal region of the
LOV-1 protein.

Since the dominant negative effect only affects the Lov response, a
stable transgenic nematode strain that expresses a dominant negative of
lov-1 can be used to screen for compounds and mutations that further
affect Response as well.

Assays based on localization and trafficking of LOV-1 and/or
PKD-2 within a cell or cells

To identify regulators and factors necessary for synthesis and
transport of LOV-1 and/or PKD-2 proteins, strains in which LOV-1 and
PKD-2 are expressed linked to a detectable label, such as a fluorescent
protein, can be and have been produced. It has been shown that these
proteins are expressed in the ciliated endings and in the baso-dendritic
compartment of HOB, ray neurons or CEM neurons.

These strains, such as PS3149, described above, can be used to
study the trafficking patterns of LOV-1 and PKD-2 and cellular location(s)
of the proteins in the animal by looking for mutants thereof that have
altered trafficking and/or altered localization of one or both of these
proteins. The mutations can be mapped, genetically analyzed and the
genes identified. Such genes could serve as therapeutic or diagnostic
targets.

Assays for identification of transcriptional regulators of expression
of lov-1 and/or pkd-2

To identify transcriptional regulators of lov-1 or pkd-2, a
screen for loss or alteration of expression of either gene is provided.
Transgenic nematodes with a reporter gene, such as a gene encoding a
FP or lacZ or other detectable product, linked to the nucleic acid encoding
lov-1 or pkd-2 are used. The animal is mutagenized or treated with a test
compound, and loss of expression or reduction in expression of either
gene is assessed by detecting, such as by observing under a dissecting or
compound microscope or by other means, including whole animal sorting,
the number of cells that express the detectable marker, such as a FP.

As a control, to avoid detection or identification of non-specific
effects, an unrelated gene, such as lin-3, linked to a reporter, is expressed
in other cells in these animals. Only mutants that exhibit changes in
expression of lov-1 or pkd-2, but not in expression of the other gene, are
selected for identification and mapping of the mutation. If expression of
the other gene is also affected, then the mutation is likely affecting a
general process and would not be of interest.
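
The selection logic of this control can be sketched as follows. This is
an illustrative assumption, not a described implementation: the
`ReporterCounts` record, the `tolerance` parameter and the example cell
counts are hypothetical names and values introduced only to show how a
mutant with a specific change in lov-1 or pkd-2 reporter expression would
be separated from one with a general expression defect.

```python
from dataclasses import dataclass


@dataclass
class ReporterCounts:
    target_cells: int    # cells expressing the lov-1 or pkd-2 reporter
    control_cells: int   # cells expressing the unrelated (e.g. lin-3) reporter


def is_specific_candidate(mutant, reference, tolerance=0):
    """True when only the target reporter differs from the reference animal.

    `tolerance` (hypothetical) is the largest cell-count difference that is
    still treated as "unchanged".
    """
    target_changed = abs(mutant.target_cells - reference.target_cells) > tolerance
    control_changed = abs(mutant.control_cells - reference.control_cells) > tolerance
    return target_changed and not control_changed


# Hypothetical cell counts, for illustration only.
reference = ReporterCounts(target_cells=20, control_cells=5)
specific = ReporterCounts(target_cells=2, control_cells=5)    # kept for mapping
general = ReporterCounts(target_cells=2, control_cells=0)     # discarded
```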

These assays will identify regulators of and factors that affect lov-1
and pkd-2 expression, which regulators and factors could serve as
therapeutic or diagnostic targets, or which can aid in developing an
understanding of the development and progression of PKD in mammals.

Visual screen based on clumping behavior 

Wild type adult males isolated from hermaphrodites will clump
together on a plate with a lawn of bacteria. In contrast, lov-1 and pkd-2
mutant males do not exhibit this clumping behavior. Rather, lov-1 and
pkd-2 mutant males are randomly dispersed in the bacterial lawn. This
assay may be used for a variety of purposes, including, but not limited to,
the identification of compounds that inhibit wild type male clumping
behavior, compounds that restore clumping behavior to lov-1 or pkd-2
mutants, and the identification of genetic suppressors of lov-1 or pkd-2
mutants.

Kits and diagnostic systems for performing the assays 

Kits for use in any of the screening assays are provided.
The kits include transgenic or wild-type nematodes or both that
express either wild-type or a mutant or a transgenic form of lov-1 and/or
pkd-2. The nematodes may be on plates, in wells or in any form suitable
for the assays. Kits containing nucleic acid encoding either of the two
genes, portions thereof, vectors or plasmids containing the nucleic acids,
probes based upon these sequences, or reporter gene constructs
containing all or portions of either or both genes and a reporter molecule
are also provided. The nucleic acids may be in solution, in lyophilized or
other concentrated form, or may be bound to a suitable substrate. The
kits can include additional reagents for performing the assays; such
reagents include any for performing any of the steps of the methods. The
kits include instructions for performing the assays.

The kits may also include suitable ancillary reagents, such as
appropriate buffers. The kits may also include suitable
ancillary supplies, such as microtiter plates, vials, calibrator solutions,
controls, wash solutions and solid-phase supports.

The kits are typically provided in packages customarily utilized in
diagnostic assays. Such packages include glass and plastic, such as
polyethylene, polypropylene and polycarbonate, bottles and vials, plastic
and plastic-foil laminated envelopes and the like. The packages may also
include containers appropriate for use in auto analyzers. The packages
typically include instructions for performing the assays.

The following examples are included for illustrative purposes only
and are not intended to limit the scope of the invention.

EXAMPLE 1

Identification of C. elegans orthologs of human polycystins 

Mating behavior and mating efficiency assays. Males were
generated by use of him-5(e1490) (high incidence of males) strains or by
heat shock of L4 hermaphrodites (Brenner (1974) Genetics 77:71-94).
Mating efficiency (ME) tests were performed by pairing six tester L4
males with six paralyzed unc-52 or four actively moving dpy-17 or N2 L4
hermaphrodites. ME is the percentage of cross progeny to total progeny
(Hodgkin (1983) Genetics 103:43-64). Behavioral observations were
done on a 0.5 cm diameter lawn of OP50 (Liu et al. Neuron 14:79-89).
Hermaphrodites (N2 or unc-31(e169)) were placed on a lawn with the
tester male. Behavioral phenotypes were determined by keeping time
with a stopwatch and manually recording the behavioral series. In one
trial, a male is observed for a minimum of 10 vulva encounters or for 10
minutes, whichever comes first. A male that does not respond to
hermaphrodite contact within 10 minutes is considered response
defective. Response ability reflects the percentage of males successfully
responding to hermaphrodite contact. An individual male's vulva location
ability was calculated as: Number of positive vulva locations/Total number
of vulva encounters. Ability can vary from 100% (always locate) to 0%
(never locate). Vulva location efficiency indicates the average behavior of
a genotypic population. Pairwise comparisons were made using Mann-
Whitney nonparametric and two-sided t tests (Instat for Macintosh).
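
The scores defined above can be computed as in the following minimal
sketch. The function names and the example numbers are hypothetical and
are introduced here for illustration; only the definitions (ME as the
percentage of cross progeny to total progeny, ability as positive
locations over total encounters, efficiency as the population average)
come from the text. Taking the population average as an arithmetic mean
is an assumption.

```python
def mating_efficiency(cross_progeny, total_progeny):
    """ME: percentage of cross progeny to total progeny."""
    if total_progeny <= 0:
        raise ValueError("total progeny must be positive")
    return 100.0 * cross_progeny / total_progeny


def vulva_location_ability(positive_locations, vulva_encounters):
    """Per-male ability, from 0% (never locates) to 100% (always locates)."""
    if vulva_encounters <= 0:
        raise ValueError("at least one vulva encounter is required")
    return 100.0 * positive_locations / vulva_encounters


def vulva_location_efficiency(abilities):
    """Average ability of a genotypic population (arithmetic mean assumed)."""
    if not abilities:
        raise ValueError("no males scored")
    return sum(abilities) / len(abilities)


def response_ability(males_responding, males_tested):
    """Percentage of males responding to hermaphrodite contact."""
    if males_tested <= 0:
        raise ValueError("at least one male must be tested")
    return 100.0 * males_responding / males_tested


# Hypothetical scores, for illustration only.
abilities = [vulva_location_ability(8, 10), vulva_location_ability(5, 10)]
population_efficiency = vulva_location_efficiency(abilities)
```
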

Genetic screen for location of vulva (Lov) mutants. PS1395
hermaphrodites of genotype plg-1(e2001d); him-5(e1490) were
mutagenized with EMS (Brenner (1974) Genetics 77:71-94).
plg-1(e2001d); him-5(e1490) males deposit a gelatinous plug over the
hermaphrodite vulva post coitum. A decrease in plugging efficiency might
reflect a decrease in mating ability. An F1 clonal screen was performed
by picking individual F1 progeny of mutagenized hermaphrodites to
individual plates and directly observing F2 males for behavioral defects.
An F2 clonal screen was performed such that 10 F1 progeny per P0
hermaphrodite were picked to the same plate, 10 F2 hermaphrodites per
F1 pool were picked to individual plates, and F3 males were observed for
decreased plugging efficiency and/or location of vulva (Lov) defects.
lov-1(sy552); plg-1(e2001d); him-5 is a recessive mutation isolated in the F2
clonal screen. lov-1(sy552) males are response and Lov defective and
also have a very low ME with dpy-17 hermaphrodites (ME-Dpy).

Genetic mapping of lov-1. Chromosomal linkage of lov-1(sy552)
was determined by scoring the loss of genetic markers relative to
response, Lov, and ME-Dpy phenotypes, which revealed linkage between
dpy-10 and sy552. Further mapping was achieved via three factor
crosses. From sy552/unc-4(e120) let-25(mn25) heterozygotes, Unc non-
Let (Unc for uncoordinated, Let for lethal) recombinants were picked. As
Unc males cannot mate, a test cross with sy552 males and Unc
hermaphrodites was performed to generate non-Unc sy552/(sy552k)unc-
25(mn25) males. Males were scored for response, Lov, and ME-Dpy
defects. 2/12 Unc non-Let recombinants segregate the lov-1 mutant
phenotype. These data placed lov-1 between unc-4 and let-25, closer to
unc-4. Deficiency mapping indicated that mnDf21 uncovers sy552
whereas eDf21 does not.

Transformation rescue of lov-1(sy552) mutants. Cosmids and
plasmids (15-100 ng/µl) in the region from the right breakpoint of eDf21
to the right breakpoint of mnDf21 and PHA-1 (pBX, 100 ng/µl) were
injected into lov-1(sy552); pha-1(e2123ts); him-5(e1490). Stable lines
were selected at either 19° or 25°C (Schnabel et al. (1990) Science
250:686-688). Cosmid ZK945 rescued sy552 response and vulva
location defects in four of five stable lines. A 16.9 kb HindIII fragment of
ZK945 cloned into pBS(SK+) (plov-1.1) containing ORFs ZK945.10 and
ZK945.9 rescued sy552 behavioral defects in 4 of 6 stable lines. A 6.7
kb HindIII-BamHI fragment of ZK945 (plov-1::GFP1) containing ORF
ZK945.10 did not rescue sy552 defects. plov-1.3 creates a frameshift at
nucleotide 17724 in ZK945 by inserting a BssHII GFP fragment from plasmid
pPD95.02 out of frame into the StuI site of plov-1.1. plov-1.3 fails to
rescue sy552.

PCR screen for genomic deletion of lov-1. Approximately 315,000
haploid genomes were screened using primers designed to delete the
PKD/channel domain. Primer set 1 (SEQ ID Nos. 7 and 8, respectively),
the outside primers, were:

JC32 5'-CTCTATTTGTGGTTCGTTGGCG-3' and
JC36 5'-GGGAGTTTCCGTTTTCATGGGG-3'; and
the internal nested primer set (SEQ ID Nos. 9 and 10, respectively) was:
JC33 5'-CTAGGACCGATGCAACAGCGAG-3' and
JC35 5'-AACGCTGATTGGTTCAAGTGTG-3'.

The primers of each set are approximately 2.5 and 2.4 kb apart,
respectively. One deletion allele, lov-1(sy582Δ), was isolated. DNA
sequence analysis indicated a deletion of nucleotides 16972 to 18027 of
ZK945.
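
The size of the lov-1 deletion, and the shift in PCR product size that a
deletion screen relies on, can be worked out as in the sketch below. It
assumes that the stated endpoints are inclusive, that the deletion lies
entirely between each primer pair, and that the wild-type products are
the approximate 2.5 kb and 2.4 kb primer spacings given above; the
function names are hypothetical.

```python
def deletion_length(first_nt, last_nt):
    """Number of nucleotides removed, counting both endpoints."""
    return abs(last_nt - first_nt) + 1


def expected_mutant_product(wild_type_bp, deleted_bp):
    """Approximate product size from a template carrying the deletion."""
    if deleted_bp >= wild_type_bp:
        raise ValueError("deletion is not contained within the amplicon")
    return wild_type_bp - deleted_bp


# Endpoints of the sy582 deletion in cosmid ZK945, as stated in the text.
sy582_bp = deletion_length(16972, 18027)

# Approximate wild-type products (outside and nested primer sets).
outside_mutant_bp = expected_mutant_product(2500, sy582_bp)
nested_mutant_bp = expected_mutant_product(2400, sy582_bp)
```

Under these assumptions the deletion removes 1056 nucleotides, so a
template carrying it would yield products roughly 1.4 kb and 1.3 kb in
length with the outside and nested primer sets, respectively.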

DNA-sequence analysis. RT-PCR from him-5(e1490) RNA using a
combination of lov-1 primers generated overlapping cDNA clones bridging
the junction between ZK945.10 and ZK945.9. Genefinder had predicted
boundaries of the last exon of ZK945.10 (from position 25742 to 25174
of ZK945) and first exon of ZK945.9 (24923 to 24444). DNA sequence
analysis of RT-PCR generated cDNA clones revealed three exons in the
junction: one from 25742 to 25195, a second from 25151 to 25071, and
a third initiating at position 25021, corresponding to exons I, J, and K in
Fig. 2b, respectively.
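
The exon and intron lengths implied by these coordinates can be checked
with the short sketch below. It assumes inclusive endpoints and treats
the descending cosmid coordinates as simple positions; the helper names
are hypothetical and the computed lengths are derived here rather than
stated in the text.

```python
def span_length(start, end):
    """Length of a feature given inclusive start and end coordinates."""
    return abs(start - end) + 1


def gap_length(upstream_end, downstream_start):
    """Nucleotides lying strictly between two adjacent features."""
    return abs(upstream_end - downstream_start) - 1


# RT-PCR-defined exons at the ZK945.10 / ZK945.9 junction.
exon_i_bp = span_length(25742, 25195)
exon_j_bp = span_length(25151, 25071)

# Introns separating exons I/J and J/K.
intron_ij_bp = gap_length(25195, 25151)
intron_jk_bp = gap_length(25071, 25021)

# Genefinder prediction for the last exon of ZK945.10, for comparison.
predicted_exon_bp = span_length(25742, 25174)
```

On these assumptions exons I and J are 548 and 81 nucleotides long and
are separated by short introns of 43 and 49 nucleotides, whereas the
Genefinder prediction for the last exon of ZK945.10 spans 569
nucleotides.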

PCR screen for genomic deletion of pkd-2

For pkd-2 the primers used (SEQ ID Nos. 11-14, respectively) were as
follows:

Outside primers
LOV2.9 (Y73F8A nt 8546-8569) 5' CCCCTCGTTTGACCATTCTATGG 3'
LOV2.10 (Y73F8A nt 8438-8457) 5' ACGTGATCCTCTGTCGATCCAG 3'
Nested primers
LOV2.9A (Y73F8A nt 5599-5615) 5' AGATCAAGCTGACTGCCCGTTC 3'
LOV2.10A (Y73F8A nt 5609-5631) 5' GATCCAGCGATTAGCCTTTAACG 3'

One deletion allele, pkd-2(sy606), was isolated, which has a 2397 bp
deletion from nucleotides 8338 to 5942 of Y73F8A (GenBank Accession
No. AL132862; corresponding to nucleotides 6734 to 4338 of SEQ ID
NO. 5). The deletion starts in intron 3 and ends in intron 5, removing
exons 4 and 5 (including the partial transmembrane spanning domain S1
and the polycystin motif). The new splice is in a different reading frame,
resulting in a stop codon (TGA) at 5736 and producing a knockout mutation.
The resulting phenotype was the same as that resulting from a knockout
of lov-1, indicating that the two proteins are part of the same
pathway that results in the observed phenotype.
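
The two coordinate systems given for the sy606 deletion can be
cross-checked as in the sketch below. It assumes inclusive endpoints and
a constant offset between the Y73F8A numbering and the SEQ ID NO. 5
numbering; the offset itself and the function names are derived or
introduced here and are not stated in the text.

```python
def deletion_length(first_nt, last_nt):
    """Number of nucleotides removed, counting both endpoints."""
    return abs(first_nt - last_nt) + 1


# Deletion endpoints as stated: Y73F8A numbering and SEQ ID NO. 5 numbering.
Y73F8A_ENDPOINTS = (8338, 5942)
SEQ_ID_5_ENDPOINTS = (6734, 4338)

# Offset inferred from the first pair of endpoints (an assumption).
OFFSET = Y73F8A_ENDPOINTS[0] - SEQ_ID_5_ENDPOINTS[0]


def to_seq_id_5(y73f8a_nt):
    """Convert a Y73F8A position to SEQ ID NO. 5 numbering (assumed offset)."""
    return y73f8a_nt - OFFSET


length_in_cosmid = deletion_length(*Y73F8A_ENDPOINTS)
length_in_seq_id_5 = deletion_length(*SEQ_ID_5_ENDPOINTS)
```

Both coordinate pairs give the stated 2397 bp, and the same offset maps
each Y73F8A endpoint onto its SEQ ID NO. 5 counterpart, so the two
descriptions of the deletion are mutually consistent.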

EXAMPLE 2 
Expression analyses of LOV-1 and PKD-2 

Methods

GFP (see, Chalfie et al. (1994) Science 263:802-805) expression
was used as a marker for lov-1 and pkd-2 gene expression (see Figs. 3a and
4a). plov-1::GFP1 was constructed by cloning a 6.7 kb HindIII-BamHI
fragment of plov-1.1 into the vector pPD95.81, and plov-1::GFP2 by cloning a
HindIII-HpaI fragment. plov-1::GFP3 and plov-1::GFP4 are SacI and
HindIII-HpaI (Klenow filled-in and religated) deletions of plov-1::GFP1,
respectively. plov-1::GFP5 was constructed by cloning a 15.4 kb HindIII-
AfeI fragment of plov-1.1 into the HindIII-SmaI site of pPD95.79.
ppkd-2.1, ppkd-2::gfp1 and ppkd-2::gfp2 were constructed by cloning PCR-
amplified 8.9 kb, 2.0 kb and 5.9 kb fragments into the vectors
pPD95.97, pPD95.75 and pPD95.77, respectively. Transgenic animals
were observed by fluorescence microscopy. Cells were identified by
comparing Nomarski and fluorescent or confocal images of the same
animals to determine cell-body position (Sulston et al. (1980) Dev. Biol.
78:542-576). HOB assignment was confirmed by laser ablation of
precursor cells.

lov-1 expression

lov-1::GFP1 is specifically expressed in male sensory neurons,
including four putative chemosensory CEM cephalic neurons, the hook
neuron HOB (Fig. 4a), and the sensory ray neurons (Fig. 4b). lov-1::GFP1
expression was first observed in a few cells during late L4 lethargus (data
not shown), while strong expression peaks in the adult male. In neuronal
cell bodies, GFP expression is cytoplasmic (non-nuclear) and punctate
(Fig. 4a and Fig. 4b). lov-1::GFP1 is localized at high levels in the cell
body and ciliated endings of CEM (Fig. 4c), HOB, and ray neurons (Fig.
4b) but is not observed in axons. Localization of lov-1::GFP1 to sensory
endings is consistent with plasma membrane localization and strengthens
the argument that lov-1 mediates sensory perception required for mating
behaviors. The temporal and spatial regulation of lov-1 is concordant
with its role in adult male mating behavior. Rays mediate response to
contact with a hermaphrodite (Liu et al. Neuron 14:79-89), the hook
mediates vulva location (Liu et al. Neuron 14:79-89), and the CEMs are
postulated to play a role in chemosensation (Ward et al. (1975) J. Comp.
Neurol. 160:313-337).

lov-1::GFP1 expression was unaltered in lov-1(sy552) mutants.
Expression of this fusion gene did not rescue lov-1(sy552) defects (Fig.
2a) and is therefore not functional. Sensory neurons and structures are
normal in lov-1(sy552) mutants as determined by osm-6::gfp expression,
dye filling of sensory neurons, Nomarski observation, and SEM imaging
(data not shown). The defects of lov-1(sy552) mutants therefore cannot
be attributed to abnormal development or differentiation of the response
and vulva location neurons. This indicates that lov-1(sy552) defects are
due to defects in the function of the cells required for response and vulva
location.

The Lov defect of mutations in lov-1 is not identical to ablation of
HOB, the chemosensory neuron in which lov-1 is expressed. Both lov-1
mutant and HOB-ablated males pass the vulva (Fig. 1). The lov-1 males,
however, are capable of precisely locating the vulva, whereas HOB-
ablated males resort to slow search. Therefore, the HOB neuron of lov-1
mutants functions, albeit in an attenuated capacity. If lov-1(sy552) and
lov-1(sy582Δ) are loss of function alleles, as the data suggest, then
additional components are involved in Lov sensation.

Chemosensation and mechanosensation are likely involved in Lov.
C. elegans sensory neurons can be polymodal: for example, by
ultrastructural assignment, the ASH neuron appears to be chemosensory
yet functions in both mechanosensory (nose touch) and chemosensory
(osmotic avoidance) modalities (Kaplan et al. (1993) Proc. Natl. Acad.
Sci. U.S.A. 90:2227-2231). HOB might similarly be a polymodal sensory
neuron. Ablation of either HOA or HOB produces identical phenotypes
(Liu et al. Neuron 14:79-89) and HOA and HOB form multiple chemical
synapses and electrical junctions (Sulston et al. (1980) Dev. Biol.
78:542-576), indicating extensive cross talk between the two hook sensory
neurons. Since LOV-1 has an extensive extracellular mucin-like domain
that could be involved in cell-cell or cell-matrix interaction, binding of
vulva cell ligand(s) might potentially gate the LOV-1 polycystin-related
channel. Another possibility is that LOV-1 could physically link the HOB
sensory endings to the sclerotized hook structure and couple hook
deflection by the hermaphrodite vulva to intracellular voltage-activated
signaling similar to hair cell mechanosensation (Hudspeth (1989) Nature
341:397-404) or touch response in C. elegans (Driscoll et al. in C.
elegans II (ed. Riddle, D.L., Blumenthal, T., Meyer, B.J., and Priess, J.R.)
645-677 (Cold Spring Harbor Laboratory Press, New York, 1997)).

pkd-2 expression

As shown herein, the C. elegans genome contains a human PKD-2
homolog. PKD-2 possesses six membrane-spanning domains, a positively
charged fourth membrane-spanning segment, a pore region, and the
coiled coil domain of all polycystins. PKD-2 is localized to the same male-
specific sensory neurons as LOV-1 (see, Fig. 3 and Fig. 4).



Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended
claims.

SEQUENCE LISTING SUMMARY

SEQ ID No. 1 cDNA encoding human PKD1
SEQ ID No. 2 encoded human PKD1 protein
SEQ ID No. 3 sequence of a gene encoding nematode LOV-1 protein
SEQ ID No. 4 encoded nematode LOV-1 protein
SEQ ID No. 5 sequence of a gene encoding a nematode PKD-2 protein
SEQ ID No. 6 encoded nematode PKD-2 protein
SEQ ID No. 7 primer for lov-1 deletion mutant construction
SEQ ID No. 8 primer for lov-1 deletion mutant construction
SEQ ID No. 9 internal primer for lov-1 deletion mutant construction
SEQ ID No. 10 internal primer for lov-1 deletion mutant construction
SEQ ID No. 11 primer for pkd-2 deletion mutant construction
SEQ ID No. 12 primer for pkd-2 deletion mutant construction
SEQ ID No. 13 internal primer for pkd-2 deletion mutant construction
SEQ ID No. 14 internal primer for pkd-2 deletion mutant construction
SEQ ID No. 15 LOV-1 mutant protein from sy582
SEQ ID No. 16 PKD-2 mutant protein from sy606

CLAIMS:

1. An isolated nucleic acid molecule, comprising:
a) a sequence of nucleotides that encodes the sequence of
amino acids encoded by one or more of the exons that is the complement
of the sequence of nucleotides set forth in SEQ ID No. 3; or
b) the sequence of nucleotides set forth as one or more of
the exons that are the complement of the sequence of nucleotides set
forth in SEQ ID No. 3;
c) a sequence of nucleotides that hybridizes along its full
length to the full length of at least one of the exons set forth in SEQ ID
No. 3 under conditions of at least moderate stringency, and that is
present in the genome of a nematode; or
d) a sequence of nucleotides degenerate with the sequence
of nucleotides of c).

2. An isolated nucleic acid molecule, comprising:
a) a sequence of nucleotides that encodes the sequence of
amino acids encoded by one or more of the exons that is the complement
of the sequence of nucleotides set forth in SEQ ID No. 5; or
b) the sequence of nucleotides set forth as one or more of
the exons that is the complement of the sequence of nucleotides set forth
in SEQ ID No. 5;
c) a sequence of nucleotides that hybridizes along its full
length to the full length of at least one of the exons of SEQ ID No. 5
under conditions of at least moderate stringency, and that is present in
the genome of a nematode; or
d) a sequence of nucleotides degenerate with the sequence
of nucleotides of c).

3. An isolated nucleic acid molecule of claim 1, that encodes
LOV-1 protein from a nematode.

4. An isolated nucleic acid molecule of claim 2, that encodes a
PKD-2 protein from a nematode.



5. The isolated molecule of claim 1 that comprises a sequence 
of nucleotides that encodes the amino acids set forth in SEQ ID No. 4. 

6. The isolated molecule of claim 2 that comprises a sequence 
of nucleotides that encodes the amino acids set forth in SEQ ID No. 6. 

7. The isolated nucleic acid molecule of claim 1, wherein the
nematode is Caenorhabditis elegans.

8. The isolated nucleic acid molecule of claim 2, wherein the 
nematode is Caenorhabditis elegans. 

9. An isolated gene, comprising the nucleic acid molecule of
claim 1.

10. The gene of claim 9, wherein the gene comprises 
transcriptional control sequences that are homologous to the encoded 
gene. 

11. The gene of claim 9, wherein the gene comprises
transcriptional control sequences that are heterologous to the encoded
gene.

12. An isolated gene, comprising the nucleic acid molecule of 
claim 2. 

13. The gene of claim 12, wherein the gene comprises
transcriptional control sequences that are homologous to the encoded
gene.

14. The gene of claim 12, wherein the gene comprises 
transcriptional control sequences that are heterologous to the encoded 
gene. 

15. An isolated nucleic acid molecule that encodes a mutant of
the protein encoded by the nucleic acid molecule of claim 3.

16. The nucleic acid molecule of claim 15, wherein the mutant is 
a deletion mutant, insertional mutant or comprises a point mutation. 

17. The nucleic acid molecule of claim 15, wherein the encoded
protein is inactive.

18. An isolated nucleic acid molecule that encodes a mutant of
the protein encoded by the nucleic acid molecule of claim 4.

19. The nucleic acid molecule of claim 18, wherein the mutant is 
a deletion mutant, insertional mutant or comprises a point mutation. 

17. The nucleic acid molecule of claim 18, wherein the encoded
protein is inactive.

18. A construct, comprising a nucleic acid molecule of claim 1 
operatively linked to a reporter gene. 

19. The construct of claim 18, wherein the reporter gene
encodes a fluorescent protein.

20. A construct, comprising a nucleic acid molecule of claim 2 
operatively linked to a reporter gene. 

21. The construct of claim 20, wherein the reporter gene
encodes a fluorescent protein.

22. A plasmid, comprising a nucleic acid molecule of claim 1.

23. The plasmid of claim 22 that is an expression vector. 

24. A transgenic nematode, comprising the vector of claim 23. 

25. The transgenic nematode of claim 24, wherein the vector
is maintained extrachromosomally.

26. The transgenic nematode of claim 24, wherein the vector
or a gene-encoding portion is integrated into the C. elegans genome.

27. The transgenic nematode of claim 24, wherein the vector
further comprises nucleic acid encoding a reporter gene operatively linked
to the nucleic acid molecule.

28. The transgenic nematode of claim 24, wherein the nucleic
acid molecule encodes a mutant protein.

29. The transgenic nematode of claim 27, wherein the nucleic 
acid molecule encodes a mutant protein. 

30. A plasmid, comprising a nucleic acid molecule of claim 2.

31. The plasmid of claim 30 that is an expression vector.

32. A transgenic nematode, comprising the vector of claim 31.

33. The transgenic nematode of claim 32, wherein the vector
is maintained extrachromosomally.

34. The transgenic nematode of claim 32, wherein the vector
or the gene-encoding portion is integrated into the C. elegans genome.

35. The transgenic nematode of claim 32, wherein the vector
further comprises nucleic acid encoding a reporter gene operatively linked
to the nucleic acid molecule.

36. The transgenic nematode of claim 32, wherein the nucleic
acid molecule encodes a mutant protein.

37. The transgenic nematode of claim 35, wherein the nucleic
acid molecule encodes a mutant protein.

38. An isolated nucleic acid molecule, comprising a sequence of
nucleotides encoding a mutant LOV-1 protein, wherein a nematode that
expresses such defect exhibits one or both of an altered location of vulva
(Lov) and response phenotype, and the LOV-1 protein is encoded by the
nucleic acid molecule of claim 1.

39. A transgenic nematode, comprising the nucleic acid 
molecule of claim 38. 

40. An isolated nucleic acid molecule, comprising a sequence of
nucleotides encoding a mutant PKD-2 protein, wherein a nematode that
expresses such defect exhibits one or both of an altered Lov and
response phenotype, and the PKD-2 protein is encoded by the nucleic
acid molecule of claim 2.

41. A transgenic nematode, comprising the nucleic acid molecule
of claim 40.

42. An isolated polypeptide encoded by the nucleic acid
molecule of claim 1.

43. The polypeptide of claim 42 that comprises the sequence of 
amino acids set forth in SEQ ID No. 4. 

44. An isolated polypeptide encoded by the nucleic acid
molecule of claim 2.

45. The polypeptide of claim 44 that comprises the sequence of
amino acids set forth in SEQ ID No. 6.

46. An isolated nucleic acid molecule of claim 19, comprising a
sequence of nucleotides that encodes the sequence of amino acids set
forth in SEQ ID No. 15.

47. An isolated complex comprising a nematode PKD-2 protein 
and a nematode LOV-1 protein in operative linkage. 

48. A method, comprising:

introducing a mutation into the lov-1 and/or pkd-2 gene of a
nematode, and

selecting nematodes that exhibit altered mating behavior, wherein
the altered behavior includes a change in the ability to locate the vulva
(Lov) of a hermaphrodite or a change in the response of the male to
contact with the hermaphrodite (Response).

49. The method of claim 48, wherein the altered behavior is a
change in the response of the male to contact with the hermaphrodite.

50. The method of claim 48, wherein the mutation is in the lov-1
gene.

51. The method of claim 48, wherein the mutation is in the
pkd-2 gene.

52. The method of claim 48, wherein the nematode is a species 
of Caenorhabditis. 

53. A method, comprising:

treating nematodes with a test compound or with a
mutagenizing agent or treatment; and

selecting from among the nematodes or offspring thereof,
nematodes that exhibit altered mating behavior compared to prior to the
treatment; where the altered behavior includes one or both of location of
vulva (Lov) or response of the male to contact with the hermaphrodite
(Response).

54. The method of claim 53, wherein prior to treatment the
nematodes had exhibited normal mating behavior.

55. The method of claim 53, wherein prior to treatment the
nematodes had exhibited defects in mating behavior, wherein the defects
were manifested as a defect in one or both of Lov and Response, and the
alteration comprises a partial restoration or complete restoration of one or
both of Lov and Response behaviors.

56. A method for identifying compounds, comprising:

contacting nematodes with a test compound;

selecting test compounds that result in altered mating behavior,
wherein:

the altered mating behavior comprises alteration in the behavior
involving location of vulva and/or response to contact with the
hermaphrodite; and

the selected test compounds are candidates for treatment of
polycystic kidney diseases of mammals.

57. The method of claim 56, wherein prior to treatment the 
nematodes had exhibited normal mating behavior. 

58. The method of claim 56, wherein prior to treatment the
nematodes had exhibited defects in mating behavior, wherein the defects
were manifested as a defect in one or both of Lov and Response, and the
alteration comprises a partial restoration or complete restoration of one or
both of Lov and Response behaviors.

59. The method of claim 56, wherein the selected compounds
are candidate therapeutic agents for treatment of autosomal dominant
polycystic kidney disease (ADPKD) or other diseases involving PKD1 or
PKD2.

60. The method of claim 59, wherein prior to treatment the
nematodes had defects in mating behavior, and the candidate compounds
restore or partially restore either or both Lov and Response.

61. A method for identifying genes that are part of the disease
pathway of autosomal dominant polycystic kidney disease (ADPKD),
comprising:

mutagenizing nematodes that exhibit normal mating behavior; and

identifying and selecting nematodes or the male offspring thereof
that exhibit altered mating behavior, wherein the altered mating behavior
comprises alteration in the behavior involving location of vulva (Lov)
and/or response to contact with the hermaphrodite (Response), thereby
identifying nematodes that contain defects in genes in the pathway that
comprises the lov-1 and/or pkd-2 gene(s).

62. The method of claim 61, further comprising, mapping the 
mutation(s) in selected nematodes that results in the altered behavior. 

63. The method of claim 62, further comprising, identifying
mammalian homologs or orthologs of the nematode genes to which the
mutation is mapped.

64. A method for identifying compounds that are candidate 
therapeutic agents for treatment of autosomal dominant polycystic kidney 
disease (ADPKD), comprising: 

treating male nematodes that can sire cross-progeny with moving 
20 partners with a test compound; and 

selecting compounds that result in males that sire fewer cross 
progeny or cannot sire cross-progeny with moving partners, wherein the 
selected compounds are candidate therapeutic agents for treatment of 
ADPKD or diseases involving PKD1 or PKD2.

65. A method for identifying genes that are part of the disease
pathway of autosomal dominant polycystic kidney disease (ADPKD),
comprising:

mutagenizing male nematodes that can sire cross-progeny with
moving partners with a test compound;

selecting males or the offspring thereof that sire fewer
cross-progeny with moving partners; and

identifying the mutant nematode genes. 

66. A method for identifying genes or regulatory factors involved
in polycystic kidney diseases, comprising:

mutagenizing nematodes that exhibit altered mating behaviors
because of a mutation in the lov-1 or pkd-2 gene;

selecting nematodes or the offspring thereof that exhibit a
restoration of the behavior associated with the wild-type gene; and

identifying a second gene other than lov-1 or pkd-2 or a factor that
results in restoration of the behavior, wherein restoration of the behavior
is a partial or complete restoration compared to prior to mutagenesis.

67. The method of claim 66, further comprising:

identifying a mammalian gene that is orthologous to the second
gene.

68. A method for screening compounds to identify candidates for 
treatment of polycystic kidney diseases, comprising: 

contacting nematodes that exhibit altered mating behaviors
because of a mutation in the lov-1 or pkd-2 gene with a test compound;
and

selecting compounds that result in restoration of the behavior, 
wherein restoration of the behavior is a partial or complete restoration 
compared to prior to contacting. 

69. A method for identifying genes or regulatory factors involved
in polycystic kidney diseases, comprising:

mutagenizing nematodes that exhibit altered mating behaviors
because of a mutation in the lov-1 or pkd-2 gene;

selecting nematodes or offspring thereof that cannot sire cross
progeny or sire fewer cross progeny with paralyzed hermaphrodite mating
partners; and

identifying a gene responsible for the inability to sire cross progeny
with paralyzed hermaphrodite mating partners.

70. The method of claim 69, further comprising identifying
mammalian homologs of the gene responsible for the inability to sire cross
progeny with paralyzed hermaphrodite mating partners.

71. A method for identifying genes or regulatory factors involved 
in polycystic kidney diseases, comprising: 

mutagenizing transgenic nematodes that contain a dominant
negative lov-1 or pkd-2 transgene;

selecting nematodes or offspring thereof that exhibit a further loss
in function of the lov-1 or pkd-2 transgene by observing mating
behaviors; and

identifying the mutations and genes responsible for the loss. 

72. The method of claim 71, further comprising identifying
homologous mammalian genes.

73. A method for identifying regulators and factors necessary for
synthesis and transport of LOV-1 or PKD-2 protein, comprising:

preparing a transgenic nematode that expresses a detectable
marker linked to LOV-1 or PKD-2 protein;

mutagenizing the nematode;

selecting nematodes or offspring thereof that have altered patterns 
of expression of LOV-1 or PKD-2; and 

identifying the gene responsible for the alteration. 

74. A method for identifying transcriptional regulators of lov-1 or
pkd-2, comprising:

preparing a transgenic nematode that expresses a detectable
marker linked to LOV-1 or PKD-2 protein;

mutagenizing the nematode; and

selecting nematodes or offspring thereof that exhibit altered levels of
expression of the protein.

75. A method, comprising:

treating nematodes with a test compound or mutagenizing them;

selecting nematodes or the offspring thereof that exhibit
altered clumping behavior when seeded on a lawn of bacteria, wherein:

an alteration in the behavior is indicative of a change in the genotype
of the lov-1 or pkd-2 locus;

wild-type males exhibit clumping behavior, and a mutation in either
locus that alters activity of either the LOV-1 or PKD-2 protein results
in males that are randomly dispersed in the bacterial lawn.

76. The method of claim 75, wherein:

the nematodes are mutant nematodes that are randomly dispersed 
in the bacterial lawn and are treated with a test compound; and the 
method further comprises: 

identifying compounds that restore or partially restore clumping
behavior.

77. The method of claim 76, wherein the mutant nematodes 
comprise males that are lov-1 mutants. 

78. The method of claim 76, wherein the mutant nematodes 
comprise males that are pkd-2 mutants. 

79. The method of claim 75, wherein:

the nematodes are mutant nematodes that are randomly dispersed 
in the bacterial lawn and then mutagenized; and the method further 
comprises: 

selecting males or the offspring thereof that exhibit a partial or
complete restoration of the behavior;

analyzing the mutations; and 

identifying the genes or mutations responsible for the restoration. 

80. The method of claim 76, wherein the genes or mutations are
genetic suppressors of lov-1 or pkd-2 mutants.

81. The method of claim 76, wherein the mutant nematodes
comprise males that are lov-1 mutants.

82. The method of claim 76, wherein the mutant nematodes
comprise males that are pkd-2 mutants. 

83. The method of claim 75, wherein: 

the nematodes are wild-type nematodes that are clumped in the
bacterial lawn and are treated with a test compound; and the method
further comprises:

identifying compounds that destroy the clumping behavior. 

84. The method of claim 75, wherein: 

the nematodes are wild-type nematodes that are clumped in the
bacterial lawn and then mutagenized; and the method further comprises:

selecting males or the offspring thereof that are randomly
dispersed on the bacterial lawn;

analyzing mutations responsible for the altered behavior; and

identifying the mutant genes.

85. A mutant strain of nematode that comprises a mutation in
the lov-1 or pkd-2 gene, whereby the resulting nematode exhibits altered
mating behavior compared to the wild-type, wherein the alteration is
manifested as either or both a defect in behavior involving location of
vulva (LOV) and response to contact with the hermaphrodite (Response).

86. The mutant strain of claim 85, wherein the mutation is in the
lov-1 gene, wherein the wild-type lov-1 gene comprises:

a) a sequence of nucleotides that encodes the sequence of
amino acids encoded by one or more of the exons that is the complement
of the sequence of nucleotides set forth in SEQ ID No. 3; or

b) the sequence of nucleotides set forth as one or more of
the exons that are the complement of the sequence of nucleotides set
forth in SEQ ID No. 3;

c) a sequence of nucleotides that hybridizes along its full
length to the full length of at least one of the exons set forth in SEQ ID
No. 3 under conditions of at least moderate stringency, and that is
present in the genome of a nematode; or

d) a sequence of nucleotides degenerate with the sequence
of nucleotides of c).

87. The mutant strain of claim 85, wherein the mutation is in the 
pkd-2 gene, wherein the wild-type pkd-2 gene comprises: 

a) a sequence of nucleotides that encodes the sequence of
amino acids encoded by one or more of the exons that is the complement
of the sequence of nucleotides set forth in SEQ ID No. 5; or

b) the sequence of nucleotides set forth as one or more of
the exons that is the complement of the sequence of nucleotides set forth
in SEQ ID No. 5;

c) a sequence of nucleotides that hybridizes along its full
length to the full length of at least one of the exons of SEQ ID No. 5
under conditions of at least moderate stringency, and that is present in
the genome of a nematode; or

d) a sequence of nucleotides degenerate with the sequence
of nucleotides of c).

88. The method of claim 65, further comprising identifying 
mammalian homologs of the genes that comprise the mutant nematode 
genes.

ABSTRACT

Nematodes, such as Caenorhabditis elegans, that express mutant
and wild-type orthologs of human genes involved in polycystic kidney
diseases (PKDs), are used to study the functions of the proteins encoded
by the genes, to screen for other genes involved in the diseases, to
identify mutations involved in the diseases, and to screen for drugs that
affect PKD. Behaviors controlled by the action of the genes or gene
products are identified and used in the assays. Hence an animal model is
provided that permits study of the etiology of polycystic kidney disease
and provides a tool to identify the genes involved in the disease pathway,
and to identify compounds that may be used to treat or alter the disease
progression, lessen its severity or ameliorate symptoms. The nematode
genes that encode protein products, mutants of the genes, vectors that
contain the genes and mutant genes, and nematode strains that contain
the vectors are also provided.



[Figure: male mating behavior diagram with panels for intact, hook ablated and lov-1(sy552) males. Step labels legible in the source: approaches vulva; stops at vulva; passes vulva; circles hermaphrodite; initiates a slow search for the vulva using the p.c.s. and spicules (t = 300 s); inserts spicules and transfers sperm.]



Figure 2

A. lov-1(sy552) rescue data

[Figure: genetic map (scale bar, 0.1 map unit) showing unc-4, lov-1, let-268, daf-19 and mec-15, with deficiencies mnDf66, mnDf29, mnDf57, mnDf63, eDf21 and mnDf21; physical map showing cosmids (including ZK945 and F27E5), predicted genes ZK945.9 and ZK945.10, and constructs plov-1.1, plov-1.2, plov-1::GFP1 and plov-1.3 (Slul, premature stop). Rescue of lov-1(sy552) is scored + or -, with (rescuing lines/lines examined). Scores legible in the source: - (0/2), + (4/5), - (0/3) for the cosmids; + (6/8), - (0/3), - (0/2), - (0/2) for the constructs. The row-by-row assignment is not recoverable from this text.]



[Figure: "lov-1 genomic structure and GFP fusions" (RT-PCR confirmed exons; predicted exons; ZK945.10; constructs lov-1::gfp1 to lov-1::gfp5 and plov-1.3) and "pkd-2 genomic structure and GFP fusions" (sequenced cDNA yk219e1; exons; Y73F8A.a and Y73F8A.b). Column headings: behavioural phenotypes (rescue of lov-1(sy552); wild type), expression pattern, subcellular localization. Entries legible in the source: rescue ratios (6/8), (0/2), (0/2), (0/3); expression "male-specific sensory neurons: CEMs, HOB, rays", "faint, nonspecific", "not expressed", "nonspecific, ring neurons (weak)"; localization "basodendritic (cell body, ...)" and "uniform (cell body, axon, dendrite)". The assignment of entries to constructs is not recoverable from this text.]

lov-1 and pkd-2 genomic structures, constructs, rescue data and expression.
The line above the lov-1 gene indicates the 1,055-bp deletion in lov-1(sy582Δ).
Numbers in parentheses indicate the ratio of rescuing stable lines to the number
of stable lines examined. DN, dominant negative.



Figure. LOV-1::GFP1 and PKD-2::GFP2 are colocalized to adult male sensory
neuron cell bodies and dendrites. The spicules, hook structure and posteriormost fan
autofluoresce. Arrows, neuronal cell bodies; arrowheads, dendrites or ciliated endings.
Images (merged DIC and fluorescence) were obtained using confocal microscopy.
a-c, lov-1::gfp1. a, HOB and ray cell bodies (arrows), HOB dendritic process (arrowhead).
b, HOB and ray process 5 (arrowheads). c, Ciliated endings in nose tip from male-specific
cephalic CEM neurons (cell bodies not shown). d-f, pkd-2::gfp2. d, Ray cell body
(arrow) and ray process 2 (arrowhead). e, Ray process 5 (arrowhead). f, Male-specific
cephalic CEM ciliated endings (arrow). Scale bar, 20 µm.




SEQUENCE LISTING 

<110> Sternberg, Paul W.
      Barr, Maureen M.

<120> POLYCYSTIC KIDNEY DISEASE GENE HOMOLOGS REQUIRED FOR MALE MATING
      BEHAVIOR IN NEMATODES AND ASSAYS BASED THEREON

<130> 18021-2901

<140> Unassigned
<141> 2000-01-06

<150> 60/115,127
<151> 1999-01-06

<160> 16

<170> PatentIn Ver. 2.0

<210> 1
<211> 12912
<212> DNA
<213> Homo sapiens PKD-1 gene

<220>
<221> CDS
<222> (1)..(12912)

<400> 1

atg ccg ccc gcc gcg ccc gcc cgc ctg gcg ctg gcc ctg ggc ctg ggc
Met Pro Pro Ala Ala Pro Ala Arg Leu Ala Leu Ala Leu Gly Leu Gly
1 5

ctg tgg ctc ggg gcg ctg gcg ggg ggg ccc ggg cgc ggc tgc ggg ccc
Leu Trp Leu Gly Ala Leu Ala Gly Gly Pro Gly Arg Gly Cys Gly Pro
20

tgc gag ccc ccc tgc ctc tgc ggg cca gcg ccc ggc gcc gcc tgc cgc
Cys Glu Pro Pro Cys Leu Cys Gly Pro Ala Pro Gly Ala Ala Cys Arg
35

gtc aac tgc tcg ggc cgc ggg ctg cgg acg ctc ggt ccc gcg ctg cgc
Val Asn Cys Ser Gly Arg Gly Leu Arg Thr Leu Gly Pro Ala Leu Arg
50 55

atc ccc gcg gac gcc aca gag cta gac gtc tcc cac aac ctg ctc cgg
Ile Pro Ala Asp Ala Thr Glu Leu Asp Val Ser His Asn Leu Leu Arg
65 70

gcg ctg gac gtt ggg ctc ctg [...]
Ala Leu Asp Val Gly Leu Leu [...]
85

gat ata agc aac aac aag att tct acg tta gaa gaa gga ata ttt gct
Asp Ile Ser Asn Asn Lys Ile Ser Thr Leu Glu Glu Gly Ile Phe Ala
100 105 110

aat tta ttt aat tta agt gaa ata aac ctg agt ggg aac ccg ttt gag
Asn Leu Phe Asn Leu Ser Glu Ile Asn Leu Ser Gly Asn Pro Phe Glu
115 120 125

tgt gac tgt ggc ctg gcg tgg ctg ccg caa tgg gcg gag gag cag cag
Cys Asp Cys Gly Leu Ala Trp Leu Pro Gln Trp Ala Glu Glu Gln Gln
130 135 140


gtg cgg gtg gtg cag ccc gag gca gcc acg tgt gct ggg cct ggc tcc 480
Val Arg Val Val Gln Pro Glu Ala Ala Thr Cys Ala Gly Pro Gly Ser
145 150 155 160

ctg gct ggc cag cct ctg ctt ggc atc ccc ttg ctg gac agt ggc tgt 528
Leu Ala Gly Gln Pro Leu Leu Gly Ile Pro Leu Leu Asp Ser Gly Cys
165 170 175

ggt gag gag tat gtc gcc tgc ctc cct gac aac agc tca ggc acc gtg 576
Gly Glu Glu Tyr Val Ala Cys Leu Pro Asp Asn Ser Ser Gly Thr Val
180 185 190

gca gca gtg tcc ttt tca gct gcc cac gaa ggc ctg ctt cag cca gag 624
Ala Ala Val Ser Phe Ser Ala Ala His Glu Gly Leu Leu Gln Pro Glu
195 200 205

gcc tgc agc gcc ttc tgc ttc tcc acc ggc cag ggc ctc gca gcc ctc 672
Ala Cys Ser Ala Phe Cys Phe Ser Thr Gly Gln Gly Leu Ala Ala Leu
210 215 220

tcg gag cag ggc tgg tgc ctg tgt ggg gcg gcc cag ccc tcc agt gcc 720
Ser Glu Gln Gly Trp Cys Leu Cys Gly Ala Ala Gln Pro Ser Ser Ala
225 230 235 240

tcc ttt gcc tgc ctg tcc ctc tgc tcc ggg ccc ccg gca cct cct gcc 768
Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly Pro Pro Ala Pro Pro Ala
245 250 255

ccc acc tgt agg ggc ccc acc ctc ctc cag cac gtc ttc cct gcc tcc 816
Pro Thr Cys Arg Gly Pro Thr Leu Leu Gln His Val Phe Pro Ala Ser
260 265 270

cca ggg gcc acc ctg gtg ggg ccc cac gga cct ctg gcc tct ggc cag 864
Pro Gly Ala Thr Leu Val Gly Pro His Gly Pro Leu Ala Ser Gly Gln
275 280 285

cta gca gcc ttc cac atc gct gcc ccg ctc cct gtc act gac aca cgc 912
Leu Ala Ala Phe His Ile Ala Ala Pro Leu Pro Val Thr Asp Thr Arg
290 295 300

tgg gac ttc gga gac ggc tcc gcc gag gtg gat gcc gct ggg ccg gct 960
Trp Asp Phe Gly Asp Gly Ser Ala Glu Val Asp Ala Ala Gly Pro Ala
305 310 315 320

gcc tcg cat cgc tat gtg ctg cct ggg cgc tat cac gtg acg gcc gtg 1008
Ala Ser His Arg Tyr Val Leu Pro Gly Arg Tyr His Val Thr Ala Val
325 330 335

ctg gcc ctg ggg gcc ggc tca gcc ctg ctg ggg aca gac gtg cag gtg 1056
Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu Gly Thr Asp Val Gln Val
340 345 350

gaa gcg gca cct gcc gcc ctg gag ctc gtg tgc ccg tcc tcg gtg cag 1104
Glu Ala Ala Pro Ala Ala Leu Glu Leu Val Cys Pro Ser Ser Val Gln
355 360 365

agt gac gag agc ctc gac ctc agc atc cag aac cgc ggt ggt tca ggc 1152
Ser Asp Glu Ser Leu Asp Leu Ser Ile Gln Asn Arg Gly Gly Ser Gly
370 375 380

ctg gag gcc gcc tac agc atc gtg gcc ctg ggc gag gag ccg gcc cga 1200
Leu Glu Ala Ala Tyr Ser Ile Val Ala Leu Gly Glu Glu Pro Ala Arg
385 390 395 400

gcg gtg cac ccg ctc tgc ccc tcg gac acg gag atc ttc cct ggc aac 1248
Ala Val His Pro Leu Cys Pro Ser Asp Thr Glu Ile Phe Pro Gly Asn
405 410 415



ggg cac tgc tac cgc ctg gtg gtg gag aag gcg gcc tgg ctg cag gcg
Gly His Cys Tyr Arg Leu Val Val Glu Lys Ala Ala Trp Leu Gln Ala
420 425 430

cag gag cag tgt cag gcc tgg gcc ggg gcc gcc ctg gca atg gtg gac
Gln Glu Gln Cys Gln Ala Trp Ala Gly Ala Ala Leu Ala Met Val Asp
435 440 445

agt ccc gcc gtg cag cgc ttc ctg gtc tcc cgg gtc acc agg agc cta
Ser Pro Ala Val Gln Arg Phe Leu Val Ser Arg Val Thr Arg Ser Leu
450 455 460

gac gtg tgg atc ggc ttc tcg act gtg cag ggg gtg gag gtg ggc cca
Asp Val Trp Ile Gly Phe Ser Thr Val Gln Gly Val Glu Val Gly Pro
465 470 475 480

gcg ccg cag ggc gag gcc ttc agc ctg gag agc tgc cag aac tgg ctg
Ala Pro Gln Gly Glu Ala Phe Ser Leu Glu Ser Cys Gln Asn Trp Leu
485 490 495

ccc ggg gag cca cac cca gcc aca gcc gag cac tgc gtc cgg ctc ggg
Pro Gly Glu Pro His Pro Ala Thr Ala Glu His Cys Val Arg Leu Gly
500 505 510

ccc acc ggg tgg tgt aac acc gac ctg tgc tca gcg ccg cac agc tac
Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys Ser Ala Pro His Ser Tyr
515 520 525

gtc tgc gag ctg cag ccc gga ggc cca gtg cag gat gcc gag aac ctc
Val Cys Glu Leu Gln Pro Gly Gly Pro Val Gln Asp Ala Glu Asn Leu
530 535 540

ctc gtg gga gcg ccc agt ggg gac ctg cag gga ccc ctg acg cct ctg
Leu Val Gly Ala Pro Ser Gly Asp Leu Gln Gly Pro Leu Thr Pro Leu
545 550 555 560

gca cag cag gac ggc ctc tca gcc ccg cac gag ccc gtg gag gtc atg
Ala Gln Gln Asp Gly Leu Ser Ala Pro His Glu Pro Val Glu Val Met
565 570 575

gta ttc ccg ggc ctg cgt ctg agc cgt gaa gcc ttc ctc acc acg gcc
Val Phe Pro Gly Leu Arg Leu Ser Arg Glu Ala Phe Leu Thr Thr Ala
580 585 590

gaa ttt ggg acc cag gag ctc cgg cgg ccc gcc cag ctg cgg ctg cag
Glu Phe Gly Thr Gln Glu Leu Arg Arg Pro Ala Gln Leu Arg Leu Gln
595 600 605

gtg tac cgg ctc ctc agc aca gca ggg acc ccg gag aac ggc agc gag
Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr Pro Glu Asn Gly Ser Glu
610 615 620

cct gag agc agg tcc ccg gac aac agg acc cag ctg gcc ccc gcg tgc
Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr Gln Leu Ala Pro Ala Cys
625 630 635 640

atg cca ggg gga cgc tgg tgc cct gga gcc aac atc tgc ttg ccg ctg
Met Pro Gly Gly Arg Trp Cys Pro Gly Ala Asn Ile Cys Leu Pro Leu
645 650 655

gac gcc tcc tgc cac ccc cag gcc tgc gcc aat ggc tgc acg tca ggg
Asp Ala Ser Cys His Pro Gln Ala Cys Ala Asn Gly Cys Thr Ser Gly
660 665 670

cca ggg cta ccc ggg gcc ccc tat gcg cta tgg aga gag ttc ctc ttc
Pro Gly Leu Pro Gly Ala Pro Tyr Ala Leu Trp Arg Glu Phe Leu Phe
675 680 685



tcc gtt ccc gcg ggg ccc ccc gcg cag tac tcg gtc acc ctc cac ggc 2112
Ser Val Pro Ala Gly Pro Pro Ala Gln Tyr Ser Val Thr Leu His Gly
690 695 700

cag gat gtc ctc atg ctc cct ggt gac ctc gtt ggc ttg cag cac gac 2160
Gln Asp Val Leu Met Leu Pro Gly Asp Leu Val Gly Leu Gln His Asp
705 710 715 720

gct ggc cct ggc gcc ctc ctg cac tgc tcg ccg gct ccc ggc cac cct 2208
Ala Gly Pro Gly Ala Leu Leu His Cys Ser Pro Ala Pro Gly His Pro
725 730 735

ggt ccc cgg gcc ccg tac ctc tcc gcc aac gcc tcg tca tgg ctg ccc 2256
Gly Pro Arg Ala Pro Tyr Leu Ser Ala Asn Ala Ser Ser Trp Leu Pro
740 745 750

cac ttg cca gcc cag ctg gag ggc act tgg ggc tgc cct gcc tgt gcc 2304
His Leu Pro Ala Gln Leu Glu Gly Thr Trp Gly Cys Pro Ala Cys Ala
755 760 765

ctg cgg ctg ctt gca caa cgg gaa cag ctc acc gtg ctg ctg ggc ttg 2352
Leu Arg Leu Leu Ala Gln Arg Glu Gln Leu Thr Val Leu Leu Gly Leu
770 775 780

agg ccc aac cct gga ctg cgg ctg cct ggg cgc tat gag gtc cgg gca 2400
Arg Pro Asn Pro Gly Leu Arg Leu Pro Gly Arg Tyr Glu Val Arg Ala
785 790 795 800

gag gtg ggc aat ggc gtg tcc agg cac aac ctc tcc tgc agc ttt gac 2448
Glu Val Gly Asn Gly Val Ser Arg His Asn Leu Ser Cys Ser Phe Asp
805 810 815

gtg gtc tcc cca gtg gct ggg ctg cgg gtc atc tac cct gcc ccc cgc 2496
Val Val Ser Pro Val Ala Gly Leu Arg Val Ile Tyr Pro Ala Pro Arg
820 825 830

gac ggc cgc ctc tac gtg ccc acc aac ggc tca gcc ttg gtg ctc cag 2544
Asp Gly Arg Leu Tyr Val Pro Thr Asn Gly Ser Ala Leu Val Leu Gln
835 840 845

gtg gac tct ggt gcc aac gcc acg gcc acg gct cgc tgg cct ggg ggc 2592
Val Asp Ser Gly Ala Asn Ala Thr Ala Thr Ala Arg Trp Pro Gly Gly
850 855 860

agt ctc agc gcc cgc ttt gag aat gtc tgc cct gcc ctg gtg gcc acc 2640
Ser Leu Ser Ala Arg Phe Glu Asn Val Cys Pro Ala Leu Val Ala Thr
865 870 875 880

ttc gtg ccc gcc tgc ccc tgg gag acc aac gat acc ctg ttc tca gtg 2688
Phe Val Pro Ala Cys Pro Trp Glu Thr Asn Asp Thr Leu Phe Ser Val
885 890 895

gta gca ctg ccg tgg ctc agt gag ggg gag cac gtg gtg gac gtg gtg 2736
Val Ala Leu Pro Trp Leu Ser Glu Gly Glu His Val Val Asp Val Val
900 905 910

gtg gaa aac agc gcc agc cgg gcc aac ctc agc ctg cgg gtg acg gcg 2784
Val Glu Asn Ser Ala Ser Arg Ala Asn Leu Ser Leu Arg Val Thr Ala
915 920 925

gag gag ccc atc tgt ggc ctc cgc gcc acg ccc agc ccc gag gcc cgt 2832
Glu Glu Pro Ile Cys Gly Leu Arg Ala Thr Pro Ser Pro Glu Ala Arg
930 935 940

gta ctg cag gga gtc cta gtg agg tac agc ccc gtg gtg gag gcc ggc 2880
Val Leu Gln Gly Val Leu Val Arg Tyr Ser Pro Val Val Glu Ala Gly
945 950 955 960

tcg gac atg gtc ttc cgg tgg acc atc aac gac aag cag tcc ctg acc
Ser Asp Met Val Phe Arg Trp Thr Ile Asn Asp Lys Gln Ser Leu Thr
965 970 975

ttc cag aac gtg gtc ttc aat gtc att tat cag agc gcg gcg gtc ttc
Phe Gln Asn Val Val Phe Asn Val Ile Tyr Gln Ser Ala Ala Val Phe
980 985 990

aag ctc tca ctg acg gcc tcc aac cac gtg agc aac gtc acc gtg aac
Lys Leu Ser Leu Thr Ala Ser Asn His Val Ser Asn Val Thr Val Asn
995 1000 1005

tac aac gta acc gtg gag cgg atg aac agg atg cag ggt ctg cag gtc
Tyr Asn Val Thr Val Glu Arg Met Asn Arg Met Gln Gly Leu Gln Val
1010 1015 1020

tcc aca gtg ccg gcc gtg ctg tcc ccc aat gcc acg cta gca ctg acg
Ser Thr Val Pro Ala Val Leu Ser Pro Asn Ala Thr Leu Ala Leu Thr
1025 1030 1035 1040

gcg ggc gtg ctg gtg gac tcg gcc gtg gag gtg gcc ttc ctg tgg acc
Ala Gly Val Leu Val Asp Ser Ala Val Glu Val Ala Phe Leu Trp Thr
1045 1050 1055

ttt ggg gat ggg gag cag gcc ctc cac cag ttc cag cct ccg tac aac
Phe Gly Asp Gly Glu Gln Ala Leu His Gln Phe Gln Pro Pro Tyr Asn
1060 1065 1070

gag tcc ttc cca gtt cca gac ccc tcg gtg gcc cag gtg ctg gtg gag
Glu Ser Phe Pro Val Pro Asp Pro Ser Val Ala Gln Val Leu Val Glu
1075 1080 1085

cac aat gtc acg cac acc tac gct gcc cca ggt gag tac ctc ctg acc
His Asn Val Thr His Thr Tyr Ala Ala Pro Gly Glu Tyr Leu Leu Thr
1090 1095 1100

gtg ctg gca tct aat gcc ttc gag aac ctg acg cag cag gtg cct gtg
Val Leu Ala Ser Asn Ala Phe Glu Asn Leu Thr Gln Gln Val Pro Val
1105 1110 1115 1120

agc gtg cgc gcc tcc ctg ccc tcc gtg gct gtg ggt gtg agt gac ggc
Ser Val Arg Ala Ser Leu Pro Ser Val Ala Val Gly Val Ser Asp Gly
1125 1130 1135

gtc ctg gtg gcc ggc cgg ccc gtc acc ttc tac ccg cac ccg ctg ccc
Val Leu Val Ala Gly Arg Pro Val Thr Phe Tyr Pro His Pro Leu Pro
1140 1145 1150

tcg cct ggg ggt gtt ctt tac acg tgg gac ttc ggg gac ggc tcc cct
Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp Phe Gly Asp Gly Ser Pro
1155 1160 1165

gtc ctg acc cag agc cag ccg gct gcc aac cac acc tat gcc tcg agg
Val Leu Thr Gln Ser Gln Pro Ala Ala Asn His Thr Tyr Ala Ser Arg
1170 1175 1180

ggc acc tac cac gtg cgc ctg gag gtc aac aac acg gtg agc ggt gcg
Gly Thr Tyr His Val Arg Leu Glu Val Asn Asn Thr Val Ser Gly Ala
1185 1190 1195 1200

gcg gcc cag gcg gat gtg cgc gtc ttt gag gag ctc cgc gga ctc agc
Ala Ala Gln Ala Asp Val Arg Val Phe Glu Glu Leu Arg Gly Leu Ser
1205 1210 1215

gtg gac atg agc ctg gcc gtg gag cag ggc gcc ccc gtg gtg gtc agc
Val Asp Met Ser Leu Ala Val Glu Gln Gly Ala Pro Val Val Val Ser
1220 1225 1230

gcc gcg gtg cag acg ggc gac aac atc acg tgg acc ttc gac atg ggg
Ala Ala Val Gln Thr Gly Asp Asn Ile Thr Trp Thr Phe Asp Met Gly
1235 1240 1245

gac ggc acc gtg ctg tcg ggc ccg gag gca aca gtg gag cat gtg tac
Asp Gly Thr Val Leu Ser Gly Pro Glu Ala Thr Val Glu His Val Tyr
1250 1255 1260

ctg cgg gca cag aac tgc aca gtg acc gtg ggt gcg ggc agc ccc gcc
Leu Arg Ala Gln Asn Cys Thr Val Thr Val Gly Ala Gly Ser Pro Ala
1265 1270 1275 1280

ggc cac ctg gcc cgg agc ctg cac gtg ctg gtc ttc gtc ctg gag gtg
Gly His Leu Ala Arg Ser Leu His Val Leu Val Phe Val Leu Glu Val
1285 1290 1295

ctg cgc gtt gaa ccc gcc gcc tgc atc ccc acg cag cct gac gcg cgg
Leu Arg Val Glu Pro Ala Ala Cys Ile Pro Thr Gln Pro Asp Ala Arg
1300 1305 1310

ctc acg gcc tac gtc acc ggg aac ccg gcc cac tac ctc ttc gac tgg
Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala His Tyr Leu Phe Asp Trp
1315 1320 1325

acc ttc ggg gat ggc tcc tcc aac acg acc gtg cgg ggg tgc ccg acg
Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr Val Arg Gly Cys Pro Thr
1330 1335 1340

gtg aca cac aac ttc acg cgg agc ggc acg ttc ccc ctg gcg ctg gtg
Val Thr His Asn Phe Thr Arg Ser Gly Thr Phe Pro Leu Ala Leu Val
1345 1350 1355 1360

ctg tcc agc cgc gtg aac agg gcg cat tac ttc acc agc atc tgc gtg
Leu Ser Ser Arg Val Asn Arg Ala His Tyr Phe Thr Ser Ile Cys Val
1365 1370 1375

gag cca gag gtg ggc aac gtc acc ctg cag cca gag agg cag ttt gtg
Glu Pro Glu Val Gly Asn Val Thr Leu Gln Pro Glu Arg Gln Phe Val
1380 1385 1390

cag ctc ggg gac gag gcc tgg ctg gtg gca tgt gcc tgg ccc ccg ttc
Gln Leu Gly Asp Glu Ala Trp Leu Val Ala Cys Ala Trp Pro Pro Phe
1395 1400 1405

ccc tac cgc tac acc tgg gac ttt ggc acc gag gaa gcc gcc ccc acc
Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr Glu Glu Ala Ala Pro Thr
1410 1415 1420

cgt gcc agg ggc cct gag gtg acg ttc atc tac cga gac cca ggc tcc
Arg Ala Arg Gly Pro Glu Val Thr Phe Ile Tyr Arg Asp Pro Gly Ser
1425 1430 1435 1440

tat ctt gtg aca gtc acc gcg tcc aac aac atc tct gct gcc aat gac
Tyr Leu Val Thr Val Thr Ala Ser Asn Asn Ile Ser Ala Ala Asn Asp
1445 1450 1455

tca gcc ctg gtg gag gtg cag gag ccc gtg ctg gtc acc agc atc aag
Ser Ala Leu Val Glu Val Gln Glu Pro Val Leu Val Thr Ser Ile Lys
1460 1465 1470

gtc aat ggc tcc ctt ggg ctg gag ctg cag cag ccg tac ctg ttc tct
Val Asn Gly Ser Leu Gly Leu Glu Leu Gln Gln Pro Tyr Leu Phe Ser
1475 1480 1485

gct gtg ggc cgt ggg cgc ccc gcc agc tac ctg tgg gat ctg ggg gac
Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr Leu Trp Asp Leu Gly Asp
1490 1495 1500

ggt ggg tgg ctc gag ggt ccg gag gtc acc cac gct tac aac agc aca
Gly Gly Trp Leu Glu Gly Pro Glu Val Thr His Ala Tyr Asn Ser Thr
1505 1510 1515 1520

ggt gac ttc acc gtt agg gtg gcc ggc tgg aat gag gtg agc cgc agc
Gly Asp Phe Thr Val Arg Val Ala Gly Trp Asn Glu Val Ser Arg Ser
1525 1530 1535

gag gcc tgg ctc aat gtg acg gtg aag cgg cgc gtg cgg ggg ctc gtc
Glu Ala Trp Leu Asn Val Thr Val Lys Arg Arg Val Arg Gly Leu Val
1540 1545 1550

gtc aat gca agc cgc acg gtg gtg ccc ctg aat ggg agc gtg agc ttc
Val Asn Ala Ser Arg Thr Val Val Pro Leu Asn Gly Ser Val Ser Phe
1555 1560 1565

agc acg tcg ctg gag gcc ggc agt gat gtg cgc tat tcc tgg gtg ctc
Ser Thr Ser Leu Glu Ala Gly Ser Asp Val Arg Tyr Ser Trp Val Leu
1570 1575 1580

tgt gac cgc tgc acg ccc atc cct ggg ggt cct acc atc tct tac acc
Cys Asp Arg Cys Thr Pro Ile Pro Gly Gly Pro Thr Ile Ser Tyr Thr
1585 1590 1595 1600

ttc cgc tcc gtg ggc acc ttc aat atc atc gtc acg gct gag aac gag
Phe Arg Ser Val Gly Thr Phe Asn Ile Ile Val Thr Ala Glu Asn Glu
1605 1610 1615

gtg ggc tcc gcc cag gac agc atc ttc gtc tat gtc ctg cag ctc ata
Val Gly Ser Ala Gln Asp Ser Ile Phe Val Tyr Val Leu Gln Leu Ile
1620 1625 1630

gag ggg ctg cag gtg gtg ggc ggt ggc cgc tac ttc ccc acc aac cac
Glu Gly Leu Gln Val Val Gly Gly Gly Arg Tyr Phe Pro Thr Asn His
1635 1640 1645

acg gta cag ctg cag gcc gtg gtt agg gat ggc acc aac gtc tcc tac
Thr Val Gln Leu Gln Ala Val Val Arg Asp Gly Thr Asn Val Ser Tyr
1650 1655 1660

agc tgg act gcc tgg agg gac agg ggc ccg gcc ctg gcc ggc agc ggc
Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro Ala Leu Ala Gly Ser Gly
1665 1670 1675 1680

aaa ggc ttc tcg ctc acc gtg ctc gag gcc ggc acc tac cat gtg cag
Lys Gly Phe Ser Leu Thr Val Leu Glu Ala Gly Thr Tyr His Val Gln
1685 1690 1695

ctg cgg gcc acc aac atg ctg ggc agc gcc tgg gcc gac tgc acc atg
Leu Arg Ala Thr Asn Met Leu Gly Ser Ala Trp Ala Asp Cys Thr Met
1700 1705 1710

gac ttc gtg gag cct gtg ggg tgg ctg atg gtg gcc gcc tcc ccg aac
Asp Phe Val Glu Pro Val Gly Trp Leu Met Val Ala Ala Ser Pro Asn
1715 1720 1725

cca gct gcc gtc aac aca agc gtc acc ctc agt gcc gag ctg gct ggt
Pro Ala Ala Val Asn Thr Ser Val Thr Leu Ser Ala Glu Leu Ala Gly
1730 1735 1740

ggc agt ggt gtc gta tac act tgg tcc ttg gag gag ggg ctg agc tgg
Gly Ser Gly Val Val Tyr Thr Trp Ser Leu Glu Glu Gly Leu Ser Trp
1745 1750 1755 1760

gag acc tcc gag cca ttt acc acc cat agc ttc ccc aca ccc ggc ctg
Glu Thr Ser Glu Pro Phe Thr Thr His Ser Phe Pro Thr Pro Gly Leu
1765 1770 1775

cac ttg gtc acc atg acg gca ggg aac ccg ctg ggc tca gcc aac gcc
His Leu Val Thr Met Thr Ala Gly Asn Pro Leu Gly Ser Ala Asn Ala
1780 1785 1790

acc gtg gaa gtg gat gtg cag gtg cct gtg agt ggc ctc agc atc agg
Thr Val Glu Val Asp Val Gln Val Pro Val Ser Gly Leu Ser Ile Arg
1795 1800 1805

gcc agc gag ccc gga ggc agc ttc gtg gcg gcc ggg tcc tct gtg ccc
Ala Ser Glu Pro Gly Gly Ser Phe Val Ala Ala Gly Ser Ser Val Pro
1810 1815 1820

ttt tgg ggg cag ctg gcc acg ggc acc aat gtg agc tgg tgc tgg gct
Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn Val Ser Trp Cys Trp Ala
1825 1830 1835 1840

gtg ccc ggc ggc agc agc aag cgt ggc cct cat gtc acc atg gtc ttc
Val Pro Gly Gly Ser Ser Lys Arg Gly Pro His Val Thr Met Val Phe
1845 1850 1855

ccg gat gct ggc acc ttc tcc atc cgg ctc aat gcc tcc aac gca gtc
Pro Asp Ala Gly Thr Phe Ser Ile Arg Leu Asn Ala Ser Asn Ala Val
1860 1865 1870

agc tgg gtc tca gcc acg tac aac ctc acg gcg gag gag ccc atc gtg
Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr Ala Glu Glu Pro Ile Val
1875 1880 1885

ggc ctg gtg ctg tgg gcc agc agc aag gtg gtg gcg ccc ggg cag ctg
Gly Leu Val Leu Trp Ala Ser Ser Lys Val Val Ala Pro Gly Gln Leu
1890 1895 1900

gtc cat ttt cag atc ctg ctg gct gcc ggc tca gct gtc acc ttc cgc
Val His Phe Gln Ile Leu Leu Ala Ala Gly Ser Ala Val Thr Phe Arg
1905 1910 1915 1920

cta cag gtc ggc ggg gcc aac ccc gag gtg ctc ccc ggg ccc cgt ttc
Leu Gln Val Gly Gly Ala Asn Pro Glu Val Leu Pro Gly Pro Arg Phe
1925 1930 1935

tcc cac agc ttc ccc cgc gtc gga gac cac gtg gtg agc gtg cgg ggc
Ser His Ser Phe Pro Arg Val Gly Asp His Val Val Ser Val Arg Gly
1940 1945 1950

aaa aac cac gtg agc tgg gcc cag gcg cag gtg cgc atc gtg gtg ctg
Lys Asn His Val Ser Trp Ala Gln Ala Gln Val Arg Ile Val Val Leu
1955 1960 1965

gag gcc gtg agt ggg ctg cag gtg ccc aac tgc tgc gag cct ggc atc
Glu Ala Val Ser Gly Leu Gln Val Pro Asn Cys Cys Glu Pro Gly Ile
1970 1975 1980

gcc acg ggc act gag agg aac ttc aca gcc cgc gtg cag cgc ggc tct
Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala Arg Val Gln Arg Gly Ser
1985 1990 1995 2000

cgg gtc gcc tac gcc tgg tac ttc tcg ctg cag aag gtc cag ggc gac
Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu Gln Lys Val Gln Gly Asp
2005 2010 2015

tcg ctg gtc atc ctg tcg ggc cgc gac gtc acc tac acg ccc gtg gcc
Ser Leu Val Ile Leu Ser Gly Arg Asp Val Thr Tyr Thr Pro Val Ala
2020 2025 2030

gcg ggg ctg ttg gag ate cag gtg cgc gec ttc aac gec ctg ggc agt 6144 
Ala Gly Leu Leu Glu lie Gin Val Arg Ala Phe Asn Ala Leu Gly Ser 
2035 2040 2045 

gag aac cgc acg ctg gtg ctg gag gtt cag gac gec gtc cag tat gtg 6192 
Glu Asn Arg Thr Leu Val Leu Glu Val Gin Asp Ala Val Gin Tyr Val 
2050 2055 2060 

gec ctg cag age ggc ccc tgc ttc acc aac cgc teg gcg cag ttt gag 6240 
Ala Leu Gin Ser Gly Pro Cys Phe Thr Asn Arg Ser Ala Gin Phe Glu 
2065 2070 2075 2080 

gec gec acc age ccc age ccc egg cgt gtg gec tac cac tgg gac ttt 6288 
Ala Ala Thr Ser Pro Ser Pro Arg Arg Val Ala Tyr His Trp Asp Phe 
2085 2090 2095 

ggg gat ggg teg cca ggg cag gac aca gat gag ccc agg gec gag cac 6336 
Gly Asp Gly Ser Pro Gly Gin Asp Thr Asp Glu Pro Arg Ala Glu His 
2100 2105 2110 

tec tac ctg agg cct ggg gac tac cgc gtg cag gtg aac gec tec aac 6384 
Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val Gin Val Asn Ala Ser Asn 
2115 2120 2125 

ctg gtg age ttc ttc gtg gcg cag gec acg gtg acc gtc cag gtg ctg 6432 
Leu Val Ser Phe Phe Val Ala Gin Ala Thr Val Thr Val Gin Val Leu 
2130 2135 2140 

gee tgc egg gag ccg gag gtg gac gtg gtc ctg ccc ctg cag gtg ctg 6480 
Ala Cys Arg Glu Pro Glu Val Asp Val Val Leu Pro Leu Gin Val Leu 
2145 2150 2155 2160 

atg egg cga tea cag cgc aac tac ttg gag gee cac gtt gac ctg cgc 6528 
Met Arg Arg Ser Gin Arg Asn Tyr Leu Glu Ala His Val Asp Leu Arg 
2165 2170 2175 

gac tgc gtc acc tac cag act gag tac cgc tgg gag gtg tat cgc acc 6576
Asp Cys Val Thr Tyr Gln Thr Glu Tyr Arg Trp Glu Val Tyr Arg Thr
2180 2185 2190 

gee age tgc cag egg ccg ggg cgc cca gcg cgt gtg gee ctg ccc ggc 6624 
Ala Ser Cys Gin Arg Pro Gly Arg Pro Ala Arg Val Ala Leu Pro Gly 
2195 2200 2205 

gtg gac gtg agc cgg cct cgg ctg gtg ctg ccg cgg ctg gcg ctg cct 6672
Val Asp Val Ser Arg Pro Arg Leu Val Leu Pro Arg Leu Ala Leu Pro
2210 2215 2220 

gtg ggg cac tac tgc ttt gtg ttt gtc gtg tea ttt ggg gac acg cca 6720 
Val Gly His Tyr Cys Phe Val Phe Val Val Ser Phe Gly Asp Thr Pro 
2225 2230 2235 2240 

ctg aca cag agc atc cag gcc aat gtg acg gtg gcc ccc gag cgc ctg 6768
Leu Thr Gln Ser Ile Gln Ala Asn Val Thr Val Ala Pro Glu Arg Leu
2245 2250 2255 



gtg ccc ate att gag ggt ggc tea tac cgc gtg tgg tea gac aca egg 6816 
Val Pro lie lie Glu Gly Gly Ser Tyr Arg Val Trp Ser Asp Thr Arg 
2260 2265 2270 

gac ctg gtg ctg gat ggg age gag tec tac gac ccc aac ctg gag gac 6864 
Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr Asp Pro Asn Leu Glu Asp 
2275 2280 2285

ggc gac cag acg ccg ctc agt ttc cac tgg gcc tgt gtg gct tcg aca
Gly Asp Gln Thr Pro Leu Ser Phe His Trp Ala Cys Val Ala Ser Thr
2290 2295 2300 

cag agg gag get ggc ggg tgt gcg ctg aac ttt ggg ccc cgc ggg age 
Gin Arg Glu Ala Gly Gly Cys Ala Leu Asn Phe Gly Pro Arg Gly Ser 
2305 2310 2315 2320 

age acg gtc acc att cca egg gag egg ctg gcg get ggc gtg gag tac 
Ser Thr Val Thr lie Pro Arg Glu Arg Leu Ala Ala Gly Val Glu Tyr 
2325 2330 2335 

acc ttc age ctg acc gtg tgg aag gec ggc cgc aag gag gag gee acc 
Thr Phe Ser Leu Thr Val Trp Lys Ala Gly Arg Lys Glu Glu Ala Thr 
2340 2345 2350 

aac cag acg gtg ctg ate egg agt ggc egg gtg ccc att gtg tec ttg 
Asn Gin Thr Val Leu lie Arg Ser Gly Arg Val Pro lie Val Ser Leu 
2355 2360 2365 

gag tgt gtg tec tgc aag gca cag gee gtg tac gaa gtg age cgc age 
Glu Cys Val Ser Cys Lys Ala Gin Ala Val Tyr Glu Val Ser Arg Ser 
2370 2375 2380 

tec tac gtg tac ttg gag ggc cgc tgc etc aat tgc age age ggc tec 
Ser Tyr Val Tyr Leu Glu Gly Arg Cys Leu Asn Cys Ser Ser Gly Ser 
2385 2390 2395 2400 

aag cga ggg egg tgg get gca cgt acg ttc age aac aag acg ctg gtg 
Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe Ser Asn Lys Thr Leu Val 
2405 2410 2415 

ctg gat gag acc acc aca tec acg ggc agt gca ggc atg cga ctg gtg 
Leu Asp Glu Thr Thr Thr Ser Thr Gly Ser Ala Gly Met Arg Leu Val 
2420 2425 2430 

ctg egg egg ggc gtg ctg egg gac ggc gag gga tac acc ttc acg etc 
Leu Arg Arg Gly Val Leu Arg Asp Gly Glu Gly Tyr Thr Phe Thr Leu 
2435 2440 2445 

acg gtg ctg ggc cgc tct ggc gag gag gag ggc tgc gee tec ate cgc 
Thr Val Leu Gly Arg Ser Gly Glu Glu Glu Gly Cys Ala Ser lie Arg 
2450 2455 2460 

ctg tec ccc aac cgc ccg ccg ctg ggg ggc tct tgc cgc etc ttc cca 
Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly Ser Cys Arg Leu Phe Pro 
2465 2470 2475 2480 

ctg ggc get gtg cac gee etc acc acc aag gtg cac ttc gaa tgc acg 
Leu Gly Ala Val His Ala Leu Thr Thr Lys Val His Phe Glu Cys Thr 
2485 2490 2495 

ggc tgg cat gac gcg gag gat get ggc gee ccg ctg gtg tac gec ctg 
Gly Trp His Asp Ala Glu Asp Ala Gly Ala Pro Leu Val Tyr Ala Leu 
2500 2505 2510 

ctg ctg egg cgc tgt cgc cag ggc cac tgc gag gag ttc tgt gtc tac 
Leu Leu Arg Arg Cys Arg Gin Gly His Cys Glu Glu Phe Cys Val Tyr 
2515 2520 2525 

aag ggc age etc tec age tac gga gec gtg ctg ccc ccg ggt ttc agg 
Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val Leu Pro Pro Gly Phe Arg 
2530 2535 2540 

cca cac ttc gag gtg ggc ctg gee gtg gtg gtg cag gac cag ctg gga 
Pro His Phe Glu Val Gly Leu Ala Val Val Val Gin Asp Gin Leu Gly 
2545 2550 2555 2560

gcc gct gtg gtc gcc ctc aac agg tct ttg gcc atc acc ctc cca gag 7728
Ala Ala Val Val Ala Leu Asn Arg Ser Leu Ala Ile Thr Leu Pro Glu
2565 2570 2575 

ccc aac ggc agc gca acg ggg ctc aca gtc tgg ctg cac ggg ctc acc 7776
Pro Asn Gly Ser Ala Thr Gly Leu Thr Val Trp Leu His Gly Leu Thr 
2580 2585 2590 

get agt gtg etc cca ggg ctg ctg egg cag gcc gat ccc cag cac gtc 7824 
Ala Ser Val Leu Pro Gly Leu Leu Arg Gin Ala Asp Pro Gin His Val 
2595 2600 2605 

ate gag tac teg ttg gcc ctg gtc acc gtg ctg aac gag tac gag egg 7872 
He Glu Tyr Ser Leu Ala Leu Val Thr Val Leu Asn Glu Tyr Glu Arg 
2610 2615 2620 

gcc ctg gac gtg gcg gca gag ccc aag cac gag cgg cag cac cga gcc 7920
Ala Leu Asp Val Ala Ala Glu Pro Lys His Glu Arg Gln His Arg Ala
2625 2630 2635 2640 

cag ata cgc aag aac ate acg gag act ctg gtg tec ctg agg gtc cac 7968 
Gin He Arg Lys Asn He Thr Glu Thr Leu Val Ser Leu Arg Val His 
2645 2650 2655 

act gtg gat gac ate cag cag ate get get gcg ctg gcc cag tgc atg 8016 
Thr Val Asp Asp He Gin Gin He Ala Ala Ala Leu Ala Gin Cys Met 
2660 2665 2670 

ggg ccc age agg gag etc gta tgc cgc teg tgc ctg aag cag acg ctg 8064 
Gly Pro Ser Arg Glu Leu Val Cys Arg Ser Cys Leu Lys Gin Thr Leu 
2675 2680 2685 

cac aag ctg gag gcc atg atg etc ate ctg cag gca gag acc acc gcg 8112 
His Lys Leu Glu Ala Met Met Leu He Leu Gin Ala Glu Thr Thr Ala 
2690 2695 2700 

ggc acc gtg acg ccc acc gcc ate gga gac age ate etc aac ate aca 8160 
Gly Thr Val Thr Pro Thr Ala He Gly Asp Ser He Leu Asn He Thr 
2705 2710 2715 2720 

gga gac ctc atc cac ctg gcc agc tcg gac gtg cgg gca cca cag ccc 8208
Gly Asp Leu Ile His Leu Ala Ser Ser Asp Val Arg Ala Pro Gln Pro
2725 2730 2735 

tea gag ctg gga gcc gag tea cca tct egg atg gtg gcg tec cag gcc 8256 
Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg Met Val Ala Ser Gin Ala 
2740 2745 2750 

tac aac ctg acc tct gcc etc atg cgc ate etc atg cgc tec cgc gtg 8304 
Tyr Asn Leu Thr Ser Ala Leu Met Arg He Leu Met Arg Ser Arg Val 
2755 2760 2765 

etc aac gag gag ccc ctg acg ctg gcg ggc gag gag ate gtg gcc cag 8352 
Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu He Val Ala Gin 
2770 2775 2780 

ggc aag cgc tcg gac ccg cgg agc ctg ctg tgc tat ggc ggc gcc cca 8400
Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro 
2785 2790 2795 2800 

ggg cct ggc tgc cac ttc tec ate ccc gag get ttc age ggg gcc ctg 8448 
Gly Pro Gly Cys His Phe Ser He Pro Glu Ala Phe Ser Gly Ala Leu 
2805 2810 2815 

gcc aac etc agt gac gtg gtg cag etc ate ttt ctg gtg gac tec aat 8496 
Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn
2820 2825 2830

ccc ttt ccc ttt ggc tat atc agc aac tac acc gtc tcc acc aag gtg
Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val
2835 2840 2845 

gec teg atg gca ttc cag aca cag gec ggc gec cag ate ccc ate gag 
Ala Ser Met Ala Phe Gin Thr Gin Ala Gly Ala Gin He Pro He Glu 
2850 2855 2860 

egg ctg gec tea gag cgc gee ate acc gtg aag gtg ccc aac aac teg 
Arg Leu Ala Ser Glu Arg Ala He Thr Val Lys Val Pro Asn Asn Ser 
2865 2870 2875 2880 

gac tgg get gec egg ggc cac cgc age tec gec aac tec gec aac tec 
Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser 
2885 2890 2895 

gtt gtg gtc cag ccc cag gec tec gtc ggt get gtg gtc acc ctg gac 
Val Val Val Gin Pro Gin Ala Ser Val Gly Ala Val Val Thr Leu Asp 
2900 2905 2910 

age age aac cct gcg gec ggg ctg cat ctg cag etc aac tat acg ctg 
Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gin Leu Asn Tyr Thr Leu 
2915 2920 2925 

ctg gac ggc cac tac ctg tct gag gaa cct gag ccc tac ctg gca gtc 
Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val 
2930 2935 2940 

tac eta cac teg gag ccc egg ccc aat gag cac aac tgc teg get age 
Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser 
2945 2950 2955 2960 

agg agg ate cgc cca gag tea etc cag ggt get gac cac egg ccc tac 
Arg Arg He Arg Pro Glu Ser Leu Gin Gly Ala Asp His Arg Pro Tyr 
2965 2970 2975 

acc ttc ttc att tec ccg ggg age aga gac cca gcg ggg agt tac cat 
Thr Phe Phe He Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His 
2980 2985 2990 

ctg aac etc tec age cac ttc cgc tgg teg gcg ctg cag gtg tec gtg 
Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gin Val Ser Val 
2995 3000 3005 

ggc ctg tac acg tec ctg tgc cag tac ttc age gag gag gac atg gtg 
Gly Leu Tyr Thr Ser Leu Cys Gin Tyr Phe Ser Glu Glu Asp Met Val 
3010 3015 3020 

tgg egg aca gag ggg ctg ctg ccc ctg gag gag acc teg ccc cgc cag 
Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gin 
3025 3030 3035 3040 

gec gtc tgc etc acc cgc cac etc acc gee ttc ggc gee age etc ttc 
Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe 
3045 3050 3055 

gtg ccc cca age cat gtc cgc ttt gtg ttt cct gag ccg aca gcg gat 
Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp 
3060 3065 3070 

gta aac tac ate gtc atg ctg aca tgt get gtg tgc ctg gtg acc tac 
Val Asn Tyr He Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr 
3075 3080 3085 

atg gtc atg gcc gcc atc ctg cac aag ctg gac cag ttg gat gcc agc
Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser
3090 3095 3100 

egg ggc cgc gec ate cct ttc tgt ggg cag egg ggc cgc ttc aag tac 
Arg Gly Arg Ala lie Pro Phe Cys Gly Gin Arg Gly Arg Phe Lys Tyr 
3105 3110 3115 3120 

gag ate etc gtc aag aca ggc tgg ggc egg ggc tea ggt ace acg gee 
Glu lie Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala 
3125 3130 3135 

cac gtg ggc ate atg ctg tat ggg gtg gac age egg age ggc cac egg 
His Val Gly lie Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg 
3140 3145 3150 

cac ctg gac ggc gac aga gee ttc cac cgc aac age ctg gac ate ttc 
His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp He Phe 
3155 3160 3165 

egg ate gee acc ccg cac age ctg ggt age gtg tgg aag ate cga gtg 
Arg He Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys He Arg Val 
3170 3175 3180 

tgg cac gac aac aaa ggg etc age cct gee tgg ttc ctg cag cac gtc 
Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gin His Val 
3185 3190 3195 3200 

ate gtc agg gac ctg cag acg gca cgc age gee ttc ttc ctg gtc aat 
He Val Arg Asp Leu Gin Thr Ala Arg Ser Ala Phe Phe Leu Val Asn 
3205 3210 3215 

gac tgg ctt teg gtg gag acg gag gee aac ggg ggc ctg gtg gag aag 
Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys 
3220 3225 3230 

gag gtg ctg gcc gcg agc gac gca gcc ctt ttg cgc ttc cgg cgc ctg
Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu 
3235 3240 3245 

ctg gtg get gag ctg cag cgt ggc ttc ttt gac aag cac ate tgg etc 
Leu Val Ala Glu Leu Gin Arg Gly Phe Phe Asp Lys His He Trp Leu 
3250 3255 3260 

tec ata tgg gac egg ccg cct cgt age cgt ttc act cgc ate cag agg 
Ser He Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg He Gin Arg 
3265 3270 3275 3280 

gee acc tgc tgc gtt etc etc ate tgc etc ttc ctg ggc gee aac gec 
Ala Thr Cys Cys Val Leu Leu He Cys Leu Phe Leu Gly Ala Asn Ala 
3285 3290 3295 

gtg tgg tac ggg get gtt ggc gac tct gec tac age acg ggg cat gtg 
Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val 
3300 3305 3310 

tec agg ctg age ccg ctg age gtc gac aca gtc get gtt ggc ctg gtg 
Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val 
3315 3320 3325 

tec age gtg gtt gtc tat ccc gtc tac ctg gee ate ctt ttt etc ttc 
Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala He Leu Phe Leu Phe 
3330 3335 3340 

egg atg tec egg age aag gtg get ggg age ccg age ccc aca cct gec 
Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala 
3345 3350 3355 3360

ggg cag cag gtg ctg gac atc gac agc tgc ctg gac tcg tcc gtg ctg
Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu
3365 3370 3375

gac age tec ttc etc acg ttc tea ggc etc cac get gag cag gee ttt 
Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Gin Ala Phe 
3380 3385 3390 

gtt gga cag atg aag agt gac ttg ttt ctg gat gat tct aag agt ctg 
Val Gly Gin Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu 
3395 3400 3405 

gtg tgc tgg ccc tec ggc gag gga acg etc agt tgg ccg gac ctg etc 
Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu 
3410 3415 3420 

agt gac ccg tec att gtg ggt age aat ctg egg cag ctg gca egg ggc 
Ser Asp Pro Ser lie Val Gly Ser Asn Leu Arg Gin Leu Ala Arg Gly 
3425 3430 3435 3440 

cag gcg ggc cat ggg ctg ggc cca gag gag gac ggc ttc tec ctg gec 
Gin Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala 
3445 3450 3455 

age ccc tac teg cct gee aaa tec ttc tea gca tea gat gaa gac ctg 
Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu 
3460 3465 3470 

ate cag cag gtc ctt gee gag ggg gtc age age cca gee cct ace caa 
He Gin Gin Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gin 
3475 3480 3485 

gac acc cac atg gaa acg gac ctg etc age age ctg tec age act cct 
Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro 
3490 3495 3500 

ggg gag aag aca gag acg ctg gcg ctg cag agg ctg ggg gag ctg ggg 
Gly Glu Lys Thr Glu Thr Leu Ala Leu Gin Arg Leu Gly Glu Leu Gly 
3505 3510 3515 3520 

cca ccc age cca ggc ctg aac tgg gaa cag ccc cag gca gcg agg ctg 
Pro Pro Ser Pro Gly Leu Asn Trp Glu Gin Pro Gin Ala Ala Arg Leu 
3525 3530 3535 

tec agg aca gga ctg gtg gag ggt ctg egg aag cgc ctg ctg ccg gee 
Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala 
3540 3545 3550 

tgg tgt gee tec ctg gee cac ggg etc age ctg etc ctg gtg get gtg 
Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val 
3555 3560 3565 

get gtg get gtc tea ggg tgg gtg ggt gcg age ttc ccc ccg ggc gtg 
Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val 
3570 3575 3580 

agt gtt gcg tgg etc ctg tec age age gee age ttc ctg gee tea ttc 
Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe 
3585 3590 3595 3600 

etc ggc tgg gag cca ctg aag gtc ttg ctg gaa gec ctg tac ttc tea 
Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser 
3605 3610 3615 

ctg gtg gee aag egg ctg cac ccg gat gaa gat gac acc ctg gta gag 
Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu 
3620 3625 3630

agc ccg gct gtg acg cct gtg agc gca cgt gtg ccc cgc gta cgg cca 10944
Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro 
3635 3640 3645 

ccc cac ggc ttt gca etc ttc ctg gec aag gaa gaa gec cgc aag gtc 10992 
Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val 
3650 3655 3660 

aag agg cta cat ggc atg ctg cgg agc ctc ctg gtg tac atg ctt ttt 11040
Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe 
3665 3670 3675 3680 

ctg ctg gtg ace ctg ctg gec age tat ggg gat gee tea tgc cat ggg 11088 
Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly 
3685 3690 3695 

cac gcc tac cgt ctg caa agc gcc atc aag cag gag ctg cac agc cgg 11136
His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg
3700 3705 3710 

gee ttc ctg gee ate acg egg tct gag gag etc tgg cca tgg atg gee 11184 
Ala Phe Leu Ala lie Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala 
3715 3720 3725 

cac gtg ctg ctg ccc tac gtc cac ggg aac cag tec age cca gag ctg 11232 
His Val Leu Leu Pro Tyr Val His Gly Asn Gin Ser Ser Pro Glu Leu 
3730 3735 3740 

ggg ccc cca cgg ctg cgg cag gtg cgg ctg cag gaa gca ctc tac cca 11280
Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro
3745 3750 3755 3760 

gac cct ccc ggc ccc agg gtc cac acg tgc tcg gcc gca gga ggc ttc 11328
Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe 
3765 3770 3775 

age acc age gat tac gac gtt ggc tgg gag agt cct cac aat ggc teg 11376 
Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser 
3780 3785 3790 

ggg acg tgg gee tat tea gcg ccg gat ctg ctg ggg gca tgg tec tgg 11424 
Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp 
3795 3800 3805 

ggc tec tgt gec gtg tat gac age ggg ggc tac gtg cag gag ctg ggc 11472 
Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gin Glu Leu Gly 
3810 3815 3820 

ctg agc ctg gag gag agc cgc gac cgg ctg cgc ttc ctg cag ctg cac 11520
Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His
3825 3830 3835 3840 

aac tgg ctg gac aac agg age cgc get gtg ttc ctg gag etc acg cgc 11568 
Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg 
3845 3850 3855 

tac age ccg gec gtg ggg ctg cac gec gee gtc acg ctg cgc etc gag 11616 
Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu 
3860 3865 3870 

ttc ccg gcg gec ggc cgc gec ctg gee gee etc age gtc cgc ccc ttt 11664 
Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe 
3875 3880 3885 

gcg ctg cgc cgc etc age gcg ggc etc teg ctg cct ctg etc acc teg 11712 
Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser
3890 3895 3900

gtg tgc ctg ctg ctg ttc gcc gtg cac ttc gcc gtg gcc gag gcc cgt
Val Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg 
3905 3910 3915 3920 

act tgg cac agg gaa ggg cgc tgg cgc gtg ctg egg etc gga gcc tgg 
Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp 
3925 3930 3935 

gcg egg tgg ctg ctg gtg gcg ctg acg gcg gcc acg gca ctg gta cgc 
Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg 
3940 3945 3950 

etc gcc cag ctg ggt gcc get gac cgc cag tgg ace cgt ttc gtg cgc 
Leu Ala Gin Leu Gly Ala Ala Asp Arg Gin Trp Thr Arg Phe Val Arg 
3955 3960 3965 

ggc cgc ccg cgc cgc ttc act age ttc gac cag gtg gcg cac gtg age 
Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gin Val Ala His Val Ser 
3970 3975 3980 

tec gca gcc cgt ggc ctg gcg gcc teg ctg etc ttc ctg ctt ttg gtc 
Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val 
3985 3990 3995 4000 

aag get gcc cag cac gta cgc ttc gtg cgc cag tgg tec gtc ttt ggc 
Lys Ala Ala Gin His Val Arg Phe Val Arg Gin Trp Ser Val Phe Gly 
4005 4010 4015 

aag aca tta tgc cga get ctg cca gag etc ctg ggg gtc acc ttg ggc 
Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly 
4020 4025 4030 

ctg gtg gtg etc ggg gta gcc tac gcc cag ctg gcc ate ctg etc gtg 
Leu Val Val Leu Gly Val Ala Tyr Ala Gin Leu Ala lie Leu Leu Val 
4035 4040 4045 

tct tec tgt gtg gac tec etc tgg age gtg gcc cag gcc ctg ttg gtg 
Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gin Ala Leu Leu Val 
4050 4055 4060 

ctg tgc cct ggg act ggg etc tct acc ctg tgt cct gcc gag tec tgg 
Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp 
4065 4070 4075 4080 

cac ctg tea ccc ctg ctg tgt gtg ggg etc tgg gca ctg egg ctg tgg 
His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp 
4085 4090 4095 

ggc gcc eta egg ctg ggg get gtt att etc cgc tgg cgc tac cac gcc 
Gly Ala Leu Arg Leu Gly Ala Val lie Leu Arg Trp Arg Tyr His Ala 
4100 4105 4110 

ttg cgt gga gag ctg tac egg ccg gcc tgg gag ccc cag gac tac gag 
Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gin Asp Tyr Glu 
4115 4120 4125 

atg gtg gag ttg ttc ctg cgc agg ctg cgc etc tgg atg ggc etc age 
Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser 
4130 4135 4140 

aag gtc aag gag ttc cgc cac aaa gtc cgc ttt gaa ggg atg gag ccg 
Lys Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro 
4145 4150 4155 4160 

ctg ccc tct cgc tcc tcc agg ggc tcc aag gta tcc ccg gat gtg ccc
Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro
4165 4170 4175

cca ccc age get ggc tec gat gec teg cac ccc tec acc tec tec age 
Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser 
4180 4185 4190 

cag ctg gat ggg ctg age gtg age ctg ggc egg ctg ggg aca agg tgt 
Gin Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys 
4195 4200 4205 

gag cct gag ccc tec cgc etc caa gee gtg ttc gag gec ctg etc acc 
Glu Pro Glu Pro Ser Arg Leu Gin Ala Val Phe Glu Ala Leu Leu Thr 
4210 4215 4220 

cag ttt gac cga etc aac cag gee aca gag gac gtc tac cag ctg gag 
Gin Phe Asp Arg Leu Asn Gin Ala Thr Glu Asp Val Tyr Gin Leu Glu 
4225 4230 4235 4240 

cag cag ctg cac age ctg caa ggc cgc agg age age egg gcg ccc gec 
Gin Gin Leu His Ser Leu Gin Gly Arg Arg Ser Ser Arg Ala Pro Ala 
4245 4250 4255 

gga tct tec cgt ggc cca tec ccg ggc ctg egg cca gca ctg ccc age 
Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser 
4260 4265 4270 

cgc ctt gec egg gec agt egg ggt gtg gac ctg gee act ggc ccc age 
Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser 
4275 4280 4285 

agg aca ccc ctt cgg gcc aag aac aag gtc cac ccc agc agc act tag
Arg Thr Pro Leu Arg Ala Lys Asn Lys Val His Pro Ser Ser Thr 
4290 4295 4300 



<210> 2 
<211> 4303 
<212> PRT 

<213> Homo sapiens PKD-1 protein 
<400> 2 

Met Pro Pro Ala Ala Pro Ala Arg Leu Ala Leu Ala Leu Gly Leu Gly 
1 5 10 15

Leu Trp Leu Gly Ala Leu Ala Gly Gly Pro Gly Arg Gly Cys Gly Pro 
20 25 30 

Cys Glu Pro Pro Cys Leu Cys Gly Pro Ala Pro Gly Ala Ala Cys Arg 
35 40 45 

Val Asn Cys Ser Gly Arg Gly Leu Arg Thr Leu Gly Pro Ala Leu Arg 
50 55 60 

Ile Pro Ala Asp Ala Thr Glu Leu Asp Val Ser His Asn Leu Leu Arg
65 70 75 80 

Ala Leu Asp Val Gly Leu Leu Ala Asn Leu Ser Ala Leu Ala Glu Leu 
85 90 95 

Asp lie Ser Asn Asn Lys lie Ser Thr Leu Glu Glu Gly lie Phe Ala 
100 105 110 

Asn Leu Phe Asn Leu Ser Glu lie Asn Leu Ser Gly Asn Pro Phe Glu 
115 120 125 

Cys Asp Cys Gly Leu Ala Trp Leu Pro Gln Trp Ala Glu Glu Gln Gln
130 135 140

Val Arg Val Val Gin Pro Glu Ala Ala Thr Cys Ala Gly Pro Gly Ser 
145 150 155 160 

Leu Ala Gly Gin Pro Leu Leu Gly lie Pro Leu Leu Asp Ser Gly Cys 
165 170 175 

Gly Glu Glu Tyr Val Ala Cys Leu Pro Asp Asn Ser Ser Gly Thr Val 
180 185 190 

Ala Ala Val Ser Phe Ser Ala Ala His Glu Gly Leu Leu Gin Pro Glu 
195 200 205 

Ala Cys Ser Ala Phe Cys Phe Ser Thr Gly Gin Gly Leu Ala Ala Leu 
210 215 220 

Ser Glu Gin Gly Trp Cys Leu Cys Gly Ala Ala Gin Pro Ser Ser Ala 
225 230 235 240 

Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly Pro Pro Ala Pro Pro Ala 
245 250 255 

Pro Thr Cys Arg Gly Pro Thr Leu Leu Gin His Val Phe Pro Ala Ser 
260 265 270 

Pro Gly Ala Thr Leu Val Gly Pro His Gly Pro Leu Ala Ser Gly Gin 
275 280 285 

Leu Ala Ala Phe His lie Ala Ala Pro Leu Pro Val Thr Asp Thr Arg 
290 295 300 



Trp Asp Phe Gly Asp Gly Ser Ala Glu Val Asp Ala Ala Gly Pro Ala 
305 310 315 320 

Ala Ser His Arg Tyr Val Leu Pro Gly Arg Tyr His Val Thr Ala Val 
325 330 335 

Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu Gly Thr Asp Val Gin Val 
340 345 350 

Glu Ala Ala Pro Ala Ala Leu Glu Leu Val Cys Pro Ser Ser Val Gin 
355 360 365 

Ser Asp Glu Ser Leu Asp Leu Ser lie Gin Asn Arg Gly Gly Ser Gly 
370 375 380 

Leu Glu Ala Ala Tyr Ser lie Val Ala Leu Gly Glu Glu Pro Ala Arg 
385 390 395 400 

Ala Val His Pro Leu Cys Pro Ser Asp Thr Glu lie Phe Pro Gly Asn 
405 410 415 

Gly His Cys Tyr Arg Leu Val Val Glu Lys Ala Ala Trp Leu Gin Ala 
420 425 430 

Gin Glu Gin Cys Gin Ala Trp Ala Gly Ala Ala Leu Ala Met Val Asp 
435 440 445 

Ser Pro Ala Val Gin Arg Phe Leu Val Ser Arg Val Thr Arg Ser Leu 
450 455 460 

Asp Val Trp He Gly Phe Ser Thr Val Gin Gly Val Glu Val Gly Pro 
465 470 475 480 

Ala Pro Gln Gly Glu Ala Phe Ser Leu Glu Ser Cys Gln Asn Trp Leu
485 490 495

Pro Gly Glu Pro His Pro Ala Thr Ala Glu His Cys Val Arg Leu Gly
500 505 510 

Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys Ser Ala Pro His Ser Tyr 
515 520 525 

Val Cys Glu Leu Gin Pro Gly Gly Pro Val Gin Asp Ala Glu Asn Leu 
530 535 540 

Leu Val Gly Ala Pro Ser Gly Asp Leu Gin Gly Pro Leu Thr Pro Leu 
545 550 555 560 

Ala Gin Gin Asp Gly Leu Ser Ala Pro His Glu Pro Val Glu Val Met 
565 570 575 

Val Phe Pro Gly Leu Arg Leu Ser Arg Glu Ala Phe Leu Thr Thr Ala 
580 585 590 

Glu Phe Gly Thr Gin Glu Leu Arg Arg Pro Ala Gin Leu Arg Leu Gin 
595 600 605 

Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr Pro Glu Asn Gly Ser Glu 
610 615 620 

Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr Gin Leu Ala Pro Ala Cys 
625 630 635 640 

Met Pro Gly Gly Arg Trp Cys Pro Gly Ala Asn lie Cys Leu Pro Leu 
645 650 655 

Asp Ala Ser Cys His Pro Gin Ala Cys Ala Asn Gly Cys Thr Ser Gly 
660 665 670 

Pro Gly Leu Pro Gly Ala Pro Tyr Ala Leu Trp Arg Glu Phe Leu Phe 
675 680 685 

Ser Val Pro Ala Gly Pro Pro Ala Gin Tyr Ser Val Thr Leu His Gly 
690 695 700 

Gin Asp Val Leu Met Leu Pro Gly Asp Leu Val Gly Leu Gin His Asp 
705 710 715 720 

Ala Gly Pro Gly Ala Leu Leu His Cys Ser Pro Ala Pro Gly His Pro 
725 730 735 

Gly Pro Arg Ala Pro Tyr Leu Ser Ala Asn Ala Ser Ser Trp Leu Pro 
740 745 750 

His Leu Pro Ala Gin Leu Glu Gly Thr Trp Gly Cys Pro Ala Cys Ala 
755 760 765 

Leu Arg Leu Leu Ala Gin Arg Glu Gin Leu Thr Val Leu Leu Gly Leu 
770 775 780 

Arg Pro Asn Pro Gly Leu Arg Leu Pro Gly Arg Tyr Glu Val Arg Ala 
785 790 795 800 

Glu Val Gly Asn Gly Val Ser Arg His Asn Leu Ser Cys Ser Phe Asp 
805 810 815 

Val Val Ser Pro Val Ala Gly Leu Arg Val lie Tyr Pro Ala Pro Arg 
820 825 830 

Asp Gly Arg Leu Tyr Val Pro Thr Asn Gly Ser Ala Leu Val Leu Gin 
835 840 845

Val Asp Ser Gly Ala Asn Ala Thr Ala Thr Ala Arg Trp Pro Gly Gly
850 855 860 

Ser Leu Ser Ala Arg Phe Glu Asn Val Cys Pro Ala Leu Val Ala Thr 
865 870 875 880 

Phe Val Pro Ala Cys Pro Trp Glu Thr Asn Asp Thr Leu Phe Ser Val 
885 890 895 

Val Ala Leu Pro Trp Leu Ser Glu Gly Glu His Val Val Asp Val Val 
900 905 910 

Val Glu Asn Ser Ala Ser Arg Ala Asn Leu Ser Leu Arg Val Thr Ala 
915 920 925 

Glu Glu Pro lie Cys Gly Leu Arg Ala Thr Pro Ser Pro Glu Ala Arg 
930 935 940 

Val Leu Gin Gly Val Leu Val Arg Tyr Ser Pro Val Val Glu Ala Gly 
945 950 955 960 

Ser Asp Met Val Phe Arg Trp Thr lie Asn Asp Lys Gin Ser Leu Thr 
965 970 975 

Phe Gin Asn Val Val Phe Asn Val lie Tyr Gin Ser Ala Ala Val Phe 
980 985 990 

Lys Leu Ser Leu Thr Ala Ser Asn His Val Ser Asn Val Thr Val Asn 
995 1000 1005 

Tyr Asn Val Thr Val Glu Arg Met Asn Arg Met Gin Gly Leu Gin Val 
1010 1015 1020 

Ser Thr Val Pro Ala Val Leu Ser Pro Asn Ala Thr Leu Ala Leu Thr 
1025 1030 1035 1040



Phe Gly Asp Gly Glu Gin Ala Leu His Gin Phe Gin Pro Pro Tyr Asn 
1060 1065 1070 

Glu Ser Phe Pro Val Pro Asp Pro Ser Val Ala Gin Val Leu Val Glu 
1075 1080 1085 

His Asn Val Thr His Thr Tyr Ala Ala Pro Gly Glu Tyr Leu Leu Thr 
1090 1095 1100 

Val Leu Ala Ser Asn Ala Phe Glu Asn Leu Thr Gln Gln Val Pro Val
1105 1110 1115 1120

Ser Val Arg Ala Ser Leu Pro Ser Val Ala Val Gly Val Ser Asp Gly 
1125 1130 1135 

Val Leu Val Ala Gly Arg Pro Val Thr Phe Tyr Pro His Pro Leu Pro 
1140 1145 1150 

Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp Phe Gly Asp Gly Ser Pro 
1155 1160 1165 

Val Leu Thr Gin Ser Gin Pro Ala Ala Asn His Thr Tyr Ala Ser Arg 
1170 1175 1180 

Gly Thr Tyr His Val Arg Leu Glu Val Asn Asn Thr Val Ser Gly Ala 
1185 1190 1195 1200

Ala Ala Gln Ala Asp Val Arg Val Phe Glu Glu Leu Arg Gly Leu Ser
1205 1210 1215

Val Asp Met Ser Leu Ala Val Glu Gln Gly Ala Pro Val Val Val Ser
1220 1225 1230 

Ala Ala Val Gin Thr Gly Asp Asn lie Thr Trp Thr Phe Asp Met Gly 
1235 1240 1245 

Asp Gly Thr Val Leu Ser Gly Pro Glu Ala Thr Val Glu His Val Tyr 
1250 1255 1260 

Leu Arg Ala Gln Asn Cys Thr Val Thr Val Gly Ala Gly Ser Pro Ala
1265 1270 1275 1280

Gly His Leu Ala Arg Ser Leu His Val Leu Val Phe Val Leu Glu Val 
1285 1290 1295 

Leu Arg Val Glu Pro Ala Ala Cys lie Pro Thr Gin Pro Asp Ala Arg 
1300 1305 1310 

Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala His Tyr Leu Phe Asp Trp 
1315 1320 1325 

Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr Val Arg Gly Cys Pro Thr 
1330 1335 1340 

Val Thr His Asn Phe Thr Arg Ser Gly Thr Phe Pro Leu Ala Leu Val 
1345 1350 1355 1360

Leu Ser Ser Arg Val Asn Arg Ala His Tyr Phe Thr Ser lie Cys Val 
1365 1370 1375 

Glu Pro Glu Val Gly Asn Val Thr Leu Gin Pro Glu Arg Gin Phe Val 
1380 1385 1390 

Gin Leu Gly Asp Glu Ala Trp Leu Val Ala Cys Ala Trp Pro Pro Phe 
1395 1400 1405 

Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr Glu Glu Ala Ala Pro Thr 
1410 1415 1420 

Arg Ala Arg Gly Pro Glu Val Thr Phe Ile Tyr Arg Asp Pro Gly Ser
1425 1430 1435 1440

Tyr Leu Val Thr Val Thr Ala Ser Asn Asn lie Ser Ala Ala Asn Asp 
1445 1450 1455 

Ser Ala Leu Val Glu Val Gin Glu Pro Val Leu Val Thr Ser lie Lys 
1460 1465 1470 

Val Asn Gly Ser Leu Gly Leu Glu Leu Gin Gin Pro Tyr Leu Phe Ser 
1475 1480 1485 

Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr Leu Trp Asp Leu Gly Asp 
1490 1495 1500 

Gly Gly Trp Leu Glu Gly Pro Glu Val Thr His Ala Tyr Asn Ser Thr 
1505 1510 1515 1520

Gly Asp Phe Thr Val Arg Val Ala Gly Trp Asn Glu Val Ser Arg Ser 
1525 1530 1535 

Glu Ala Trp Leu Asn Val Thr Val Lys Arg Arg Val Arg Gly Leu Val 
1540 1545 1550 

Val Asn Ala Ser Arg Thr Val Val Pro Leu Asn Gly Ser Val Ser Phe 
1555 1560 1565

Ser Thr Ser Leu Glu Ala Gly Ser Asp Val Arg Tyr Ser Trp Val Leu
1570 1575 1580 

Cys Asp Arg Cys Thr Pro Ile Pro Gly Gly Pro Thr Ile Ser Tyr Thr
1585 1590 1595 1600

Phe Arg Ser Val Gly Thr Phe Asn lie lie Val Thr Ala Glu Asn Glu 
1605 1610 1615 

Val Gly Ser Ala Gin Asp Ser lie Phe Val Tyr Val Leu Gin Leu lie 
1620 1625 1630 

Glu Gly Leu Gin Val Val Gly Gly Gly Arg Tyr Phe Pro Thr Asn His 
1635 1640 1645 

Thr Val Gin Leu Gin Ala Val Val Arg Asp Gly Thr Asn Val Ser Tyr 
1650 1655 1660 

Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro Ala Leu Ala Gly Ser Gly 
1665 1670 1675 1680

Lys Gly Phe Ser Leu Thr Val Leu Glu Ala Gly Thr Tyr His Val Gin 
1685 1690 1695 

Leu Arg Ala Thr Asn Met Leu Gly Ser Ala Trp Ala Asp Cys Thr Met 
1700 1705 1710 

Asp Phe Val Glu Pro Val Gly Trp Leu Met Val Ala Ala Ser Pro Asn 
1715 1720 1725 

Pro Ala Ala Val Asn Thr Ser Val Thr Leu Ser Ala Glu Leu Ala Gly 
1730 1735 1740 

Gly Ser Gly Val Val Tyr Thr Trp Ser Leu Glu Glu Gly Leu Ser Trp 
1745 1750 1755 1760

Glu Thr Ser Glu Pro Phe Thr Thr His Ser Phe Pro Thr Pro Gly Leu 
1765 1770 1775 

His Leu Val Thr Met Thr Ala Gly Asn Pro Leu Gly Ser Ala Asn Ala 
1780 1785 1790 

Thr Val Glu Val Asp Val Gin Val Pro Val Ser Gly Leu Ser lie Arg 
1795 1800 1805 

Ala Ser Glu Pro Gly Gly Ser Phe Val Ala Ala Gly Ser Ser Val Pro 
1810 1815 1820 

Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn Val Ser Trp Cys Trp Ala
1825 1830 1835 1840

Val Pro Gly Gly Ser Ser Lys Arg Gly Pro His Val Thr Met Val Phe 
1845 1850 1855 

Pro Asp Ala Gly Thr Phe Ser lie Arg Leu Asn Ala Ser Asn Ala Val 
1860 1865 1870 

Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr Ala Glu Glu Pro lie Val 
1875 1880 1885 

Gly Leu Val Leu Trp Ala Ser Ser Lys Val Val Ala Pro Gly Gin Leu 
1890 1895 1900 

Val His Phe Gin He Leu Leu Ala Ala Gly Ser Ala Val Thr Phe Arg 
1905 1910 1915 1920

Leu Gin Val Gly Gly Ala Asn Pro Glu Val Leu Pro Gly Pro Arg Phe 
1925 1930 1935 

Ser His Ser Phe Pro Arg Val Gly Asp His Val Val Ser Val Arg Gly 
1940 1945 1950 

Lys Asn His Val Ser Trp Ala Gin Ala Gin Val Arg lie Val Val Leu 
1955 1960 1965 

Glu Ala Val Ser Gly Leu Gin Val Pro Asn Cys Cys Glu Pro Gly lie 
1970 1975 1980 

Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala Arg Val Gln Arg Gly Ser
1985 1990 1995 2000

Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu Gin Lys Val Gin Gly Asp 
2005 2010 2015 

Ser Leu Val lie Leu Ser Gly Arg Asp Val Thr Tyr Thr Pro Val Ala 
2020 2025 2030 

Ala Gly Leu Leu Glu lie Gin Val Arg Ala Phe Asn Ala Leu Gly Ser 
2035 2040 2045 

Glu Asn Arg Thr Leu Val Leu Glu Val Gin Asp Ala Val Gin Tyr Val 
2050 2055 2060 

Ala Leu Gin Ser Gly Pro Cys Phe Thr Asn Arg Ser Ala Gin Phe Glu 
065 2070 2075 2080 

Ala Ala Thr Ser Pro Ser Pro Arg Arg Val Ala Tyr His Trp Asp Phe 
2085 2090 2095 

Gly Asp Gly Ser Pro Gly Gin Asp Thr Asp Glu Pro Arg Ala Glu His 
2100 2105 2110 

Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val Gin Val Asn Ala Ser Asn 
2115 2120 2125 

Leu Val Ser Phe Phe Val Ala Gin Ala Thr Val Thr Val Gin Val Leu 
2130 2135 2140 

Ala Cys Arg Glu Pro Glu Val Asp Val Val Leu Pro Leu Gin Val Leu 
145 2150 2155 2160 

Met Arg Arg Ser Gin Arg Asn Tyr Leu Glu Ala His Val Asp Leu Arg 
2165 2170 2175 

Asp Cys Val Thr Tyr Gin Thr Glu Tyr Arg Trp Glu Val Tyr Arg Thr 
2180 2185 2190 

Ala Ser Cys Gin Arg Pro Gly Arg Pro Ala Arg Val Ala Leu Pro Gly 
2195 2200 2205 

Val Asp Val Ser Arg Pro Arg Leu Val Leu Pro Arg Leu Ala Leu Pro 
2210 2215 2220 

Val Gly His Tyr Cys Phe Val Phe Val Val Ser Phe Gly Asp Thr Pro 
225 2230 2235 2240 

Leu Thr Gin Ser lie Gin Ala Asn Val Thr Val Ala Pro Glu Arg Leu 
2245 2250 2255 

Val Pro lie lie Glu Gly Gly Ser Tyr Arg Val Trp Ser Asp Thr Arg 
2260 2265 2270 

Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr Asp Pro Asn Leu Glu Asp 
2275 2280 2285 

Gly Asp Gln Thr Pro Leu Ser Phe His Trp Ala Cys Val Ala Ser Thr 
2290 2295 2300 

Gln Arg Glu Ala Gly Gly Cys Ala Leu Asn Phe Gly Pro Arg Gly Ser 
2305 2310 2315 2320 

Ser Thr Val Thr Ile Pro Arg Glu Arg Leu Ala Ala Gly Val Glu Tyr 
2325 2330 2335 

Thr Phe Ser Leu Thr Val Trp Lys Ala Gly Arg Lys Glu Glu Ala Thr 
2340 2345 2350 

Asn Gln Thr Val Leu Ile Arg Ser Gly Arg Val Pro Ile Val Ser Leu 
2355 2360 2365 

Glu Cys Val Ser Cys Lys Ala Gln Ala Val Tyr Glu Val Ser Arg Ser 
2370 2375 2380 

Ser Tyr Val Tyr Leu Glu Gly Arg Cys Leu Asn Cys Ser Ser Gly Ser 
2385 2390 2395 2400 

Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe Ser Asn Lys Thr Leu Val 
2405 2410 2415 

Leu Asp Glu Thr Thr Thr Ser Thr Gly Ser Ala Gly Met Arg Leu Val 
2420 2425 2430 

Leu Arg Arg Gly Val Leu Arg Asp Gly Glu Gly Tyr Thr Phe Thr Leu 
2435 2440 2445 

Thr Val Leu Gly Arg Ser Gly Glu Glu Glu Gly Cys Ala Ser Ile Arg 
2450 2455 2460 

Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly Ser Cys Arg Leu Phe Pro 
2465 2470 2475 2480 

Leu Gly Ala Val His Ala Leu Thr Thr Lys Val His Phe Glu Cys Thr 
2485 2490 2495 

Gly Trp His Asp Ala Glu Asp Ala Gly Ala Pro Leu Val Tyr Ala Leu 
2500 2505 2510 

Leu Leu Arg Arg Cys Arg Gln Gly His Cys Glu Glu Phe Cys Val Tyr 
2515 2520 2525 

Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val Leu Pro Pro Gly Phe Arg 
2530 2535 2540 

Pro His Phe Glu Val Gly Leu Ala Val Val Val Gln Asp Gln Leu Gly 
2545 2550 2555 2560 

Ala Ala Val Val Ala Leu Asn Arg Ser Leu Ala Ile Thr Leu Pro Glu 
2565 2570 2575 

Pro Asn Gly Ser Ala Thr Gly Leu Thr Val Trp Leu His Gly Leu Thr 
2580 2585 2590 

Ala Ser Val Leu Pro Gly Leu Leu Arg Gln Ala Asp Pro Gln His Val 
2595 2600 2605 

Ile Glu Tyr Ser Leu Ala Leu Val Thr Val Leu Asn Glu Tyr Glu Arg 
2610 2615 2620 

Ala Leu Asp Val Ala Ala Glu Pro Lys His Glu Arg Gln His Arg Ala 
2625 2630 2635 2640 

Gln Ile Arg Lys Asn Ile Thr Glu Thr Leu Val Ser Leu Arg Val His 
2645 2650 2655 

Thr Val Asp Asp Ile Gln Gln Ile Ala Ala Ala Leu Ala Gln Cys Met 
2660 2665 2670 

Gly Pro Ser Arg Glu Leu Val Cys Arg Ser Cys Leu Lys Gln Thr Leu 
2675 2680 2685 

His Lys Leu Glu Ala Met Met Leu Ile Leu Gln Ala Glu Thr Thr Ala 
2690 2695 2700 

Gly Thr Val Thr Pro Thr Ala Ile Gly Asp Ser Ile Leu Asn Ile Thr 
2705 2710 2715 2720 

Gly Asp Leu Ile His Leu Ala Ser Ser Asp Val Arg Ala Pro Gln Pro 
2725 2730 2735 

Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg Met Val Ala Ser Gln Ala 
2740 2745 2750 

Tyr Asn Leu Thr Ser Ala Leu Met Arg Ile Leu Met Arg Ser Arg Val 
2755 2760 2765 

Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu Ile Val Ala Gln 
2770 2775 2780 

Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro 
2785 2790 2795 2800 

Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala Phe Ser Gly Ala Leu 
2805 2810 2815 

Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn 
2820 2825 2830 

Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val 
2835 2840 2845 

Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala Gln Ile Pro Ile Glu 
2850 2855 2860 

Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys Val Pro Asn Asn Ser 
2865 2870 2875 2880 

Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser 
2885 2890 2895 

Val Val Val Gln Pro Gln Ala Ser Val Gly Ala Val Val Thr Leu Asp 
2900 2905 2910 

Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln Leu Asn Tyr Thr Leu 
2915 2920 2925 

Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val 
2930 2935 2940 

Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser 
2945 2950 2955 2960 

Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala Asp His Arg Pro Tyr 
2965 2970 2975 

Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His 
2980 2985 2990 

Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gln Val Ser Val 
2995 3000 3005 

Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser Glu Glu Asp Met Val 
3010 3015 3020 

Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gln 
3025 3030 3035 3040 

Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe 
3045 3050 3055 

Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp 
3060 3065 3070 

Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr 
3075 3080 3085 

Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser 
3090 3095 3100 

Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg Gly Arg Phe Lys Tyr 
3105 3110 3115 3120 

Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala 
3125 3130 3135 

His Val Gly Ile Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg 
3140 3145 3150 

His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp Ile Phe 
3155 3160 3165 

Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys Ile Arg Val 
3170 3175 3180 

Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gln His Val 
3185 3190 3195 3200 

Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala Phe Phe Leu Val Asn 
3205 3210 3215 

Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys 
3220 3225 3230 

Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu 
3235 3240 3245 

Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp Lys His Ile Trp Leu 
3250 3255 3260 

Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg Ile Gln Arg 
3265 3270 3275 3280 

Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe Leu Gly Ala Asn Ala 
3285 3290 3295 

Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val 
3300 3305 3310 

Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val 
3315 3320 3325 

Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala Ile Leu Phe Leu Phe 
3330 3335 3340 

Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala 
3345 3350 3355 3360 

Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu 
3365 3370 3375 

Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Gln Ala Phe 
3380 3385 3390 

Val Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu 
3395 3400 3405 

Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu 
3410 3415 3420 

Ser Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln Leu Ala Arg Gly 
3425 3430 3435 3440 

Gln Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala 
3445 3450 3455 

Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu 
3460 3465 3470 

Ile Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gln 
3475 3480 3485 

Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro 
3490 3495 3500 

Gly Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu Gly Glu Leu Gly 
3505 3510 3515 3520 

Pro Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln Ala Ala Arg Leu 
3525 3530 3535 

Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala 
3540 3545 3550 

Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val 
3555 3560 3565 

Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val 
3570 3575 3580 

Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe 
3585 3590 3595 3600 

Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser 
3605 3610 3615 

Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu 
3620 3625 3630 

Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro 
3635 3640 3645 

Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val 
3650 3655 3660 

Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe 
3665 3670 3675 3680 

Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly 
3685 3690 3695 

His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg 
3700 3705 3710 

Ala Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala 
3715 3720 3725 

His Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser Ser Pro Glu Leu 
3730 3735 3740 

Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro 
3745 3750 3755 3760 

Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe 
3765 3770 3775 

Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser 
3780 3785 3790 

Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp 
3795 3800 3805 

Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gln Glu Leu Gly 
3810 3815 3820 

Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His 
3825 3830 3835 3840 

Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg 
3845 3850 3855 

Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu 
3860 3865 3870 

Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe 
3875 3880 3885 

Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser 
3890 3895 3900 

Val Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg 
3905 3910 3915 3920 

Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp 
3925 3930 3935 

Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg 
3940 3945 3950 

Leu Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr Arg Phe Val Arg 
3955 3960 3965 

Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val Ala His Val Ser 
3970 3975 3980 

Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val 
3985 3990 3995 4000 

Lys Ala Ala Gln His Val Arg Phe Val Arg Gln Trp Ser Val Phe Gly 
4005 4010 4015 

Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly 
4020 4025 4030 

Leu Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala Ile Leu Leu Val 
4035 4040 4045 

Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln Ala Leu Leu Val 
4050 4055 4060 

Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp 
4065 4070 4075 4080 

His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp 
4085 4090 4095 

Gly Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp Arg Tyr His Ala 
4100 4105 4110 

Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gln Asp Tyr Glu 
4115 4120 4125 

Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser 
4130 4135 4140 

Lys Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro 
4145 4150 4155 4160 

Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro 
4165 4170 4175 

Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser 
4180 4185 4190 

Gln Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys 
4195 4200 4205 

Glu Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu Ala Leu Leu Thr 
4210 4215 4220 

Gln Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val Tyr Gln Leu Glu 
4225 4230 4235 4240 

Gln Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser Arg Ala Pro Ala 
4245 4250 4255 

Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser 
4260 4265 4270 

Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser 
4275 4280 4285 

Arg Thr Pro Leu Arg Ala Lys Asn Lys Val His Pro Ser Ser Thr 
4290 4295 4300 



<210> 3 
<211> 12685 
<212> DNA 

<213> C. elegans lov-1 gene 
<400> 3 

tcaatctttc tccacatcgt ttagccgcca cttctggaat ctctttggtc cagtttcgtg 60 
aatagcagag acaggatcat aggagagtgt gtagttgatg actgtttggt tttggtattg 120 
accttgagtt tggagcattc tggtggcacg atgatgaagc agattgactt tggcaacagc 180 
gctgtggaat agacggaagt ctttttgagt gtcagcaatt gaaactggag caaaatcttt 240 
tggttcaaga agacccaagc gacgttttgt ctgaaattaa ataacagaaa ttaaagaaca 300 
tctaatagtg agcttgaaaa ataaatacct tgtattttat gtgatcgatt atttcgtaat 360 
cattggtctg cttctcactg tcattacgaa tttcctcgaa ctcgaacata attatagtga 420 
cgtaaagttg caggacgagc tttgatccgg caatcatata aagcatgatc acaacaaacg 480 
caaattgaga aatcggttga atagaggtaa catcaagttt tccaagcatt ccagccaatg 540 
ctgtttgaaa ggtagccatt aagctccgat atctggaacc aatttttaaa aattgatttc 600 
tttcaattaa gttttcatcc tcaccctccc attttatttc ctaaaactgc gtacaataca 660 
gagttgaatg tcatgctgaa gaacaggaaa gcaattccaa atgacacaat agctccgaga 720 
gcgttatcca gtgtagccgc taatactcca attcttctgt tgaatctcaa gattcgaatc 780 
attttacaag aagtgaagaa tacggctccg gcaagacaat aactgaatac aatctcccaa 840 
tttctctgtt cagtcaaatt aatgtacgaa tttccattgt ttgcattgaa atcttccatt 900 
gctctatttg tggttcgttg gcggatggtg taggctagga ccgatgcaac agcgagagct 960 
ccaactatca agtccatgaa gttccatggt gagaagtttc tataaaatgt ctttttgaaa 1020 
ctgaagttct cattagaccc accccagtgc cagctgatac acaattttga atggatttct 1080 
cgttggtttc attgttgtta tcaccttgta ccgcccatac aagtagaaca caatctcttt 1140 
tacaaatatg agaactgaga aaaagatgta aagcatctca taatacttga ccacagttcc 1200 
atcgcttccc tctgatttga taagtcttac tgattcaacc caactattag gaagataaat 1260 
tcctgacttt ggaatctcca ccaacaactg taccaccgaa aagtagttga tttgagcatt 1320 
gtatgcagag aactcaatga tgactgctcg agtatgatca tcgatccatc gttccgaatc 1380 
aagtttattg aagagagtga tgatttccgc ttgggtacca gacatactga tagtatatcc 1440 
acctcctgaa tagctataca gtaggcctga aactgtttca gtggataatt cttcagaagt 1500 
cttgtaggtg tattcatctg aagcatctgt tccattctcg gattccagtt cggtccaacc 1560 
agcttgcatg tacaaagttt tttcttcgtt tctgaaacct ctctattagt tggaaattga 1620 
agatttttac tcacttgctt gtcaactcct ctccacaatc attgatgtat ccttgaaact 1680 
gcttgaacat cgtacactct gcacttttct ttgtccgaac ctgccgtatc gtacctattc 1740 
ccatacttct tgaaacttta tcattcatgt aggctctcat cccgtatgca ggatttccgt 1800 
cgtaccaaga agccaaaaga gcagtggcca gagattcacg agcccaatcc cagaaatcgt 1860 
cagcatgttg gattgacatg aaagtattgt caccgtagtt cttttgattg atgttcaaga 1920 
ttgtgctcat ctgaaaataa taagttcatc taaatctatg tgcattaaag tctacctcca 1980 
actgatacca atatccatgc cggtctttgc aatagtatgt cagcataacc ataatataca 2040 
aagaagcaaa gaaacaaagc atatcacgaa tggttataaa taactgttca tctctcattt 2100 
ttcggttttc agtgtctcgg agctttgtaa catcagcaat ttcggttccc agaccttttt 2160 
cgattttccc atagggattt cctgaatttc agtaatgaat tctgatagct tctttttata 2220 
aaacttactc aagaacgtct cagctggctt agctcttagc aatgcttcct ccaacttgtt 2280 
aatgatttta tgactctttc tggttttcaa aattaaaaac gcccaaatca atcccttaat 2340 
tggctcgaac accactgccc atagaatcag actgatcaga aatcggatat agaaagagtt 2400 
ggctaaatca tccatcaagc tcattccagc tccagaaata taaataagac ccatgagaac 2460 
tggaaatact atgatggtac gtgccatccc agccatgaac atcggccatg aaccactatt 2520 
atccttgaat tccggatcct ctctctttcg ttttttgtag tagtaatgtt cactgtggga 2580 
acgacatttg gtgcataata aaatgtgcaa tgagttgagg aaagtgataa gaacaccgaa 2640 
tccaactccg aatgcaatat cttttatagt gaaagtgaac tcggagacac tcttcgaatc 2700 
actgataatc gaattatcgc tcttcagaat tgtgatgcta atcatgctga ccacaacaag 2760 
tgagaagatg atactgacag aatagtcttg ccttgacact cgatccctca accgattgcc 2820 
tccaccagta aacatggcaa accaggaaat tgtttgagcc agcatatgca tactcattga 2880 
ctcatccaaa aaccttcgct tatactccac tcgcgctagt ctttcagtct ctccgtctcc 2940 
gtttttagtt ccaagccaat tgttgaaagg gaagtagtag atatcctgag tctgtagatc 3000 
tttcacaatt attcgattgc aataccacga ctctcggtga tctagaccag catcgtcaag 3060 
ccagagtctc atgtattcca actcgccaag agggctgaaa tattaaattt ggtaaatgat 3120 
ttttgatttg aaaacttgaa ttagtccatc aaaaaccaaa acaagttagg gggataaaaa 3180 
aaactacacg tccaatctat aattagctca actcacactt gaaccaatca gcgttgtccg 3240 
aatttgcata attggttgaa acgtgtgtgg agctaatcgg tattatttat tttaattatt 3300 
tttttattgc taaaaatcag cgtcttctaa cttacaatgc cgttgtcatc acaaatcgat 3360 
cagtggttcc ccatgaaaac ggaaactccc aattaccatc ttcttctgat ctgaacgatc 3420 
ggaaaatctg atccccttca tttccagata aattgaaaca tatcgtacta tccgtagttg 3480 
caaacattcg atatccagtc tccacggcaa tcacatacat gtatccatca tgaggctcat 3540 
tgtctttcag aaaacgaagt ctcccgcgtg atgcatcttt acgttgacag atgattgcat 3600 
tgatggtaag acatccgtaa actactagca tgaaaacagc ggcaatcatc actttcacat 3660 
tcttttcgat ttcattcaca ttataattgt aagagaaatc tgcatcaata gttggattga 3720 
atgcaccaac agagaacatt gttaaatgat cagttgaaca attaacgaac tgcattcctt 3780 
gtccatcact tggatacatt ccttcagaat tgaagacatc cgatgttttc tgatagaagt 3840 
aacatccttt actcactgca gcgacttgat aatccattgg tacacttcgt gcaaatgacc 3900 
actgcattga atcgtactga ccatagttta caatatccga actatttcca gtgttagtag 3960 
agctatttct ttttccaatt ccaataaaga aaagtccagt gttgttgatc aaatttccgg 4020 
cggtgacaaa ataattgctt gtcttgttca atgtgttcaa gtcaaagatc cattcatgat 4080 
ttgattcaag tgggccagga agactttgga atgatgagaa catgtaggtg tcatcgttgt 4140 
ttggaatttc atagtcttga gatgcaataa tctctacttg aagcgagttg ttccagttag 4200 
tggttcgaaa agcatgaaga tctaatatct gataactggc aaagtctcct tgttgcatta 4260 
aagttagcac tgcatcatcc tcacttcccc tgccgttcac ataaattgga gcagtagttc 4320 
cggttattct aaaaacattt taacttatat tggaaaaatt ataggttatt caataaactt 4380 
acggaatgat tatctgattc tcatctttga tatgtgcttc aagtgcacct gaggtaatta 4440 
acatatcaaa gttatccaca taagttctcg ggtttgtggc atagcaaact aatccaactt 4500 
gaatcagtgt tttatctgtg atctcagcag tattcagagt tgaagcgggc gatggaagtt 4560 
tgaatgccca ttcttcacag ttttgagttt ttcctacaat atttgatgca tcatcgataa 4620 
cgattaccat tccggttccg tcgacgctat tctaaaaatt tgattgacat tagtgtaaac 4680 
tgtaactttt tgattaccga gtagtcataa ggtaagttgc cagttgctat agctctagct 4740 
gctagcgtgt tttccagggt atctagtgta gatgcaagtt ggttggctag atttttggcc 4800 
tgaatagaat atgaataatt ccaaactcaa aaagttttaa aaactcacga tgttcttctg 4860 
gaacattttt gtgacgtaag cagcccattc ttctgacgtc atttcctcca cgtacacaat 4920 
attgtctgga tcacttggta gcacgttgta caaggaatca tagttatctg ttgcatattt 4980 
caaattggca gctagatcag aagaaagagg attgtcgaga gcaatcttca acgctgatgt 5040 
taaagatccc gcaattgaaa ggagagagtt ggatgtctga aaattattat tatgacatct 5100 
accaaagttt agtgtatgaa tacatcatta atctctgcat ctgtcatagt cattccatta 5160 
tttgtaagaa aatcttgagt attgctaaga atcgtttcaa tttgttgctg ggattcttga 5220 
ttcatcaaac tttgcggcac aatatctgga tatttaaaat gatttgtatg catttgtgta 5280 
tttatttttt tgagttacca attttagttc ccgaaacatc gccttccgtc gaaacttgtc 5340 
cattctgaat tacatatcca ttcgttccat ttctaataat caaccctcca tcgattctca 5400 
ttttctgtgt gtcagagaat tccattaaag ttctcaaaat taaagttcct atacaaaata 5460 
tccattcgag actatactta caccgtatga gtttgagaac cagatgtgta agttccggct 5520 
gttatctgaa taccgtaacc tccaagcgcc aaagaaacaa tattgtaggt gggagaactg 5580 
atcaccagag ttgtcacaga gtttgacaaa taaagggcgg tggagtctac gaaaaatgag 5640 
tagccgctga catcgatgga tgacgctttg gagaagaata gctgcaagaa aagttatttt 5700 
gatattaaca actcatcagc aaaagtctta cagttcccga aacggtgaca agtgtttgtt 5760 
ccttaatcaa tgattgcaaa tcatctttgg tgtacgaagc cgtcggtgaa agagtagata 5820 
ttgagacagt ttgagataaa gcagagacag ctggaattcc aatgtcctgc ggagacagca 5880 
ccattcgcat tgaataactt cccgcgtttg agaacacaag tggagaagta gtttgattga 5940 
ataaagcaag gttgacgtct gaaattttta gcgtcataac cagaccactc ccattgcata 6000 
cttactctca ataactatcg gcatctgaat caacttctta accgtgctcg ctgaggatcc 6060 
atctgatgcc acaaccgtga agctgtcagc attgtatgag aaaatggcat tatctgtgtc 6120 
tacttcaaca aactttcctg ttccattcgt gcaagttatt gtgtaagtaa ctccaccata 6180 
tgatgcagtt actactagtc ctctagatac aatattttga gtagaatgca gttttatcat 6240 
tgttccagat tcaatctgaa taaaaattga aaaatttcat gtgctctacg atttataaat 6300 
ttaccaataa tatgtattga acttctgaga ctgggatatt gaacgaaaat gaagtgttct 6360 
tctgacttgt ggatcctact gttgaattgt cataagtgac attccaagca gttacataca 6420 
tatcattata actatattcc cctgattgaa catctccaac tccactcatt gtagccatta 6480 
tatttccaga aaccagtgat cgtttgtcat ctattttatc ctgctcaact tgcacagtcc 6540 
ttccatttgt tgcgaacatt ccttggattc cgatggcagc tgctagcata gtgtccgcct 6600 
gaacagtttc acataaacta caatgttcta tattcaaaaa gtcttacagt aacggtatct 6660 
ccatcaagct gtttataaga tgcccgggta tctcctgtaa ggtatattgt agatccgtaa 6720 
atgctgagct gtaagttgaa aacaattaac tctcccaacc atcattttct taccgtacac 6780 
ctaggcgata ccaatgtata tcccgatgca ataacatatc caaaaataac cgcgtaagtt 6840 
ccatcttttg attttgtact tgatactccc aaggtataga ctgtacttcc tttgagagca 6900 
agatcgagag aagacagtac ggagttcaag gattgtgcgg aagtcatatt gacatttgct 6960 
agtttagtga taacttttgc catttcgtcg gccaattccg aatttgttgt tgcaatattg 7020 
tcttgcaatg ttttcaaaac atcaacactt gacatatttc cgactccagg gattttgagg 7080 
gtatttgaga gcaaactttg agcaacttcc actagatctg cggcaggtag agatgagatt 7140 
tgattgagaa gagagctgct tgtgtttaga gagttgttgg atgcagatcc atccattatc 7200 
ccagccagtt ggttcattac atcagctttt tgagcatcta tgatcgcttg ttcagctgca 7260 
gaaattggag aaactgttgc aagagagctg cgagttttag tagttcttgt agatggttgc 7320 
gcggtagcag agaatgcacc atttgcacca gaggaatcgg aagatccaga cgtatcggag 7380 
ccagatgagc tcttggttga aacaccagaa gatccatttg aatctgatcc tgagccagat 7440 
gtactaccac cgtctcctaa atgggatcct ggtgtggtag ctgttccaga tcctgtacca 7500 
tctccattca atgccgttgt ttttccagag tttaccccgt ccgaacccgt ccctgaagac 7560 
ccccctgacc cacttccaga agcggtcgtt cctgatccac cagcccccga tcccgttgat 7620 
gattgtccac ttccagatcc agaagtggtt gatctgactg catcacccgt gctaagagtt 7680 
gtcgcagatc ctcctgaacc tgttccacca gttcccccag tagctccagt tccaccggtt 7740 
ttgccaccgg catcatccga cgagctgaca gtggttggtg gggtttctaa aaattgaatt 7800 
ttatgaaaaa aaaacagtaa tgcgcttacc agtagttgta gttggaagaa ctacattcat 7860 
agtaaagatg tgactggcag attctccaga tgcacgattg gtaacattaa ttaagaattc 7920 
ataagttcca gttgcaggta caaagctggt cattgggttg aaagatacgg aggcactata 7980 
tgcaccattt tttccaactg taatatagta cataaatgaa aattgataac gttgctaact 8040 
agggagtgct ctttgtactc catggaaata tgagtcggaa aactactttt cgggtagttt 8100 
atgtcatttt ctacacgatt ctgaaagaaa atcctgccgg ttttgggttt tagtgtgaaa 8160 
agtttgcgtt tgaaaatacc accctaaatt cagttgattt aacactacgc gacccatatt 8220 
tcatgtgcaa cgggaaagcc aagtacactg aaaactcact ttcaaatttt caaagcaaag 8280 
tcataatttc ggtggtccag tggataacgg cgggagcggc gccagttttc agtgtacttg 8340 
gctttcccgt tgcacatgaa atatgggtcg cgtagtgtta aatcaactga atttagggtg 8400 
gtattttcaa acgcaaactt ttcacactaa aacccaaaac cggcaggatt ttctttcaga 8460 
atcgtgtaga aaatgatatc aactaccgga aaagtagttt tccgactcat atttccatgg 8520 
agtacaaaaa gcactcccta gtaggcaaaa tctcacactc tgtcaagcaa ttgctttcct 8580 
tcagtaaaac aaatggctga gaagaatcgt ttcgacattg acaggtcata gcagtcttaa 8640 
aatattaagg ttttttttaa gtaagattga tttgaatatc ttaccgttaa agctccgggc 8700 
tctttgtttg gaattggtgt aatataaaat ggattttcag agtaatacac tgtctcgttc 8760 
catgtcaaat tttccaatac aaagtcgtaa ggggtttctg tacttgcgtc tggatccaca 8820 
gttgttgttc tcatgctatc agatgcactt gatgaatcgg atggtgtttg agataaacca 8880 
gatgaatctg aagtggatgg agttgagtta gatgagtcta tggtggtgga atctgaagtt 8940 
gtgccagaat ccgatgttga tgtagaactg tcttgtgaag catcagtcgt gctagcttcc 9000 
aaagtcgacg tagattcaat tcctgaagtt gtggaaatgg ctgaactttc ggaagaagat 9060 
ccagttgaag ttgtactggt aacttcggat gtacttgttg aatccgcaac aacattcaat 9120 
gtgaaaatgt ggctgaccac ttgcattgtc gtcaaatccg tcatgttgat tcgaaactca 9180 
taggtgccaa tgccaacaag aaatgtttca attggttgaa ttggaatatt agaagaataa 9240 
gttgcgttta ggaccgtgtt tgctggaaat tcgctttaat tcaataattt caaaaagttt 9300 
gttaatccta cagtagttta agcaagtgga ttccttaatt ataagaaaag gctcagtaga 9360 
cacatttctg cattcaaatg tttggcttct ctaaaatcat tcgtttcatt tggctcaacg 9420 
atttattaaa cccgcctcag taggagtaat ggcattagtt ggtaaaggta caatgttgat 9480 
gctgtcttca ttgtgacgtg tctcgttcca tgacaatcca ctgtccaaga tgaaatcgaa 9540 
ctgatcgact gacaaagtgg aagttgaatc agcagtggta gaactcggac tggaagaagt 9600 
tgtggatgtg tccgagatgg tagttgtact ggaatcagta gtactttctt cggaagttgt 9660 
tgtggattct tgtaaagtag ttgtgctccc agccgacgtt gtcgtagaat cgcttgaacg 9720 
tgtggatgac ggctctgtga ctgtggatgt tgataatgtg gaagatggag ttgatggtgt 9780 
agatgacgta gagcttgcta cagcagatgt agaagattcc gactctatgg tggtggaaga 9840 
atattcctga atataaacgt tcgcatacgt gtagtaaact tttttatcgt cggttgtcat 9900 
agttgctctg aaggtgtagt ttccaggacc aacgaatgtg ctagcagggt acgtccctcc 9960 
gagacgtggc attgatacac tttctgaata tattgaaaaa atatgtgtaa aaaatctaaa 10020 
taactatcgc ccgaaaaagg tttgcttttt ttccgatttg aagtttttat agaaacgttt 10080 
tcagaattaa agattttgcc tgtctctaat ttataataag tctttataaa caattactcg 10140 
tgaaacatgc tccgtcttta gtggttgaaa cataattaga agatgtaggg cttgtacatt 10200 
cgatggatgt ttgatactga aatacagtgt tacatttgaa taatgcagtc ttcaatattg 10260 
tacttactcc aatgattccg aggccggaat taagcgttag gttcacactt gtggaatcat 10320 
agaacgttgt agttgccttt tctacaaaat agaagtctgg attagttccg tccgagctag 10380 
tagttgtctc agattttgtt gttgttgaac tttgctgagt agacgtggat gattgagtag 10440 
aagaagcggt gcttgacaca gacgaagaag ccgttgaact tggagtggaa gaagaactcg 10500 
atggcccagt agttgatgtt gaaggagcag ttgtgctggt tgttactgta gacgaaggag 10560 
atgtcgatgt ggactcagtg cttgttgggg tcgttacagt agtcgaggag cttgaactag 10620 
atgtcaccgt agaagtcaca ggggatgttg acggggaagt agttacagta ctcgtcgatg 10680 
gttcggttgt tgacgttgaa gcagtcgagg tggtcaaagt agtggttggt tcagttgttg 10740 
taacagtaga agatgtagat gtaacttctg ttgtggttgt ggaagtagat ggttcttcgg 10800 
tagtagtgga tgtcaacata gtggtggtga aagtggtaga cgtagtggtc tcgtcaaggt 10860 
aggagcaaat cgcattatcc gggagagacg acaaagttgc tgaaaatttt cgttaaggat 10920 
tttctggata actaacaatg cacaacaagg tgatcggtaa tagtgactgc tttgttttac 10980 
cctgagcaaa ctgtaattgt ataaaatctg aaatattggc aatacaaacg ggtttgaaga 11040 
aaattattaa caattttatt cctgcctctc aatcataaca gcaaattctg gtttgcttgt 11100 
aattattatt gtgcgtccga aactcacatg tgatttcggg tgttgtagtg gttggaatac 11160 
ttgtactcag tgtggtggag ctcggcgagc ttgtgattgt gctagaactc tgctgagttg 11220 
tgctagttga tgatgtcgac gtggatgctg tggaagtgaa cgttgttgac gtggattcga 11280 
tggtggtgga tgttgatgga gttgatgtgc ttgtgctcat tgcggtagtc accgtacttg 11340 
tacttgttgg gacggttgtt gtagatgtca cggtactagt cacggttgtg gtagttggag 11400 
ttgtagtcga ggttgaagtt gactcaatag tgtagtcgca agtattatca ccctggaata 11460 
aaatgaaagt aaacactatc tgagaaatcg tactcacagc gtctcgtttc attcttctca 11520 
aagtaggtga tccagaagtc ctcatgctaa actgttttgc cctgacaccc ttaagtacct 11580 
gcccatcata atacactcct tgactatcac tgatagctgt gctcttttca gaacgatctg 11640 
aaatactgtt tagccaatgt tcatgagcaa ttaagaactg acaaggtcgc ttgcacattc 11700 
ttctcgcata ctcttcgttg atctctccgc tctcacactt ctctcggtag cccagcaacg 11760 
ccatctcgtt tccaccgact tggagccacc atagggagcc atcacatctg tcgataccgt 11820 
catctgagaa agagtttcta ttaaaatgtt agaaacacat agcactacat atgcaaataa 11880 
cgtttcacca gattcagaat gcgcaattca tgcctatctc atagcctacc tatgtgtcta 11940 
cctgagtatc tacttgagta ccttgcaaag aagattaatc ggcacaaacc aagtcaagac 12000 
tttgttggca taggtcttcc aggtgagtaa cgccgacatt atacataggt acgcacaaaa 12060 
ccttccccaa ataataatcc ttaccataac aaacttcata tttcgcctcc acagcaatac 12120 
tgatctcatc gtcatcattc acttcattca aagtaatcca agttgagttc aaaaagagtc 12180 
cgacaagcct ggtctctgtt tggatgcagt tgtgaatctg aataggaaca acaaggtttt 12240 
acaactaaaa aaatacacga ctaaccaatt ccaaacttga aacttccgta accttgttct 12300 
caactgaaag tctattcaat ccgcagctca atttgatttt aacgactcct tgtgaattcc 12360 
ttggaactcc tccaattgtt gtgtcatcgt tgtctaatcg aaaagttgcg atcccgtcaa 12420 
gaagttggta atgcaatcca tcaatttgta tcttaaaagt agttttattc agcttttcct 12480 
tctgagattt ttcactcacc gccgatattg ccagtagcaa tagaacaaag aagtttgact 12540 
tcttcatcca atgagctgga aggttatctt gtagaagttt tgtaaaaatt cgcctgaaaa 12600 
caaaaatgaa ttcagagcag aaaagacaac aactgaaaaa tgaagttgtc gaaaagcgaa 12660 
aaggcgggct gaatcgaagg accat 12685 



<210> 4 
<211> 3178 
<212> PRT 

<213> C. elegans Lov-1 protein 



<400> 4 

Met Val Leu Arg Phe Ser Pro Pro Phe Arg Phe Ser Thr Thr Ser Phe 
1 5 10 15 

Phe Ser Cys Cys Leu Phe Cys Ser Glu Phe Ile Phe Val Phe Arg Arg 
20 25 30 

Ile Phe Thr Lys Leu Leu Gln Asp Asn Leu Pro Ala His Trp Met Lys 
35 40 45 

Lys Ser Asn Phe Phe Val Leu Leu Leu Leu Ala Ile Ser Ala Ile Gln 
50 55 60 

Ile Asp Gly Leu His Tyr Gln Leu Leu Asp Gly Ile Ala Thr Phe Arg 
65 70 75 80 

Leu Asp Asn Asp Asp Thr Thr Ile Gly Gly Val Pro Arg Asn Ser Gln 
85 90 95 

Gly Val Val Lys Ile Lys Leu Ser Cys Gly Leu Asn Arg Leu Ser Val 
100 105 110 

Glu Asn Lys Val Thr Glu Val Ser Ser Leu Glu Leu Ile His Asn Cys 
115 120 125 

Ile Gln Thr Glu Thr Arg Leu Val Gly Leu Phe Leu Asn Ser Thr Trp 
130 135 140 

Ile Thr Leu Asn Glu Val Asn Asp Asp Asp Glu Ile Ser Ile Ala Val 
145 150 155 160 

Glu Ala Lys Tyr Glu Val Cys Tyr Asp Asp Gly Ile Asp Arg Cys Asp 
165 170 175 

Gly Ser Leu Trp Trp Leu Gln Val Gly Gly Asn Glu Met Ala Leu Leu 
180 185 190 

Gly Tyr Arg Glu Lys Cys Glu Ser Gly Glu Ile Asn Glu Glu Tyr Ala 
195 200 205 

Arg Arg Met Cys Lys Arg Pro Tyr Arg Ser Glu Lys Ser Thr Ala Ile 
210 215 220 

Ser Asp Ser Gln Gly Val Tyr Tyr Asp Gly Gln Val Leu Lys Gly Val 
225 230 235 240 

Arg Ala Lys Gln Phe Ser Met Arg Thr Ser Gly Ser Pro Thr Leu Arg 
245 250 255 

Arg Met Lys Arg Asp Ala Gly Asp Asn Thr Cys Asp Tyr Thr Ile Glu 
260 265 270 

Ser Thr Ser Thr Ser Thr Thr Thr Pro Thr Thr Thr Thr Val Thr Ser 
275 280 285 

Thr Val Thr Ser Thr Thr Thr Val Pro Thr Ser Thr Ser Thr Val Thr 
290 295 300 

Thr Ala Met Ser Thr Ser Thr Ser Thr Pro Ser Thr Ser Thr Thr Ile 
305 310 315 320 

Glu Ser Thr Ser Thr Thr Phe Thr Ser Thr Ala Ser Thr Ser Thr Ser 
325 330 335 

Ser Thr Ser Thr Thr Gln Gln Ser Ser Ser Thr Ile Thr Ser Ser Pro 
340 345 350 

Ser Ser Thr Thr Leu Ser Thr Ser Ile Pro Thr Thr Thr Thr Pro Glu 
355 360 365 

Ile Thr Ser Thr Leu Ser Ser Leu Pro Asp Asn Ala Ile Cys Ser Tyr 
370 375 380 

Leu Asp Glu Thr Thr Thr Ser Thr Thr Phe Thr Thr Thr Met Leu Thr 
385 390 395 400 

Ser Thr Thr Thr Glu Glu Pro Ser Thr Ser Thr Thr Thr Thr Glu Val 
405 410 415 

Thr Ser Thr Ser Ser Thr Val Thr Thr Thr Glu Pro Thr Thr Thr Leu 
420 425 430 

Thr Thr Ser Thr Ala Ser Thr Ser Thr Thr Glu Pro Ser Thr Ser Thr 
435 440 445 

Val Thr Thr Ser Pro Ser Thr Ser Pro Val Thr Ser Thr Val Thr Ser 
450 455 460 

Ser Ser Ser Ser Ser Thr Thr Val Thr Thr Pro Thr Ser Thr Glu Ser 
465 470 475 480 

Thr Ser Thr Ser Pro Ser Ser Thr Val Thr Thr Ser Thr Thr Ala Pro 
485 490 495 

Ser Thr Ser Thr Thr Gly Pro Ser Ser Ser Ser Ser Thr Pro Ser Ser 
500 505 510 

Thr Ala Ser Ser Ser Val Ser Ser Thr Ala Ser Ser Thr Gln Ser Ser 
515 520 525 

Thr Ser Thr Gln Gln Ser Ser Thr Thr Thr Lys Ser Glu Thr Thr Thr 
530 535 540 

Ser Ser Asp Gly Thr Asn Pro Asp Phe Tyr Phe Val Glu Lys Ala Thr 
545 550 555 560 

Thr Thr Phe Tyr Asp Ser Thr Ser Val Asn Leu Thr Leu Asn Ser Gly 
565 570 575 

Leu Gly Ile Ile Gly Tyr Gln Thr Ser Ile Glu Cys Thr Ser Pro Thr 
580 585 590 

Ser Ser Asn Tyr Val Ser Thr Thr Lys Asp Gly Ala Cys Phe Thr Lys 
595 600 605 

Ser Val Ser Met Pro Arg Leu Gly Gly Thr Tyr Pro Ala Ser Thr Phe 
610 615 620 

Val Gly Pro Gly Asn Tyr Thr Phe Arg Ala Thr Met Thr Thr Asp Asp 
625 630 635 640 

Lys Lys Val Tyr Tyr Thr Tyr Ala Asn Val Tyr Ile Gln Glu Tyr Ser 
645 650 655 

Ser Thr Thr Ile Glu Ser Glu Ser Ser Thr Ser Ala Val Ala Ser Ser 
660 665 670 

Thr Ser Ser Thr Pro Ser Thr Pro Ser Ser Thr Leu Ser Thr Ser Thr 
675 680 685 

Val Thr Glu Pro Ser Ser Thr Arg Ser Ser Asp Ser Thr Thr Thr Ser 
690 695 700 

Ala Gly Ser Thr Thr Thr Leu Gln Glu Ser Thr Thr Thr Ser Glu Glu 
705 710 715 720 

Ser Thr Thr Asp Ser Ser Thr Thr Thr Ile Ser Asp Thr Ser Thr Ser 
725 730 735 

Thr Ser Ser Pro Ser Ser Thr Thr Ala Asp Ser Thr Ser Thr Leu Ser 
740 745 750 

Val Asp Gln Phe Asp Phe Ile Leu Asp Ser Gly Leu Ser Trp Asn Glu 
755 760 765 

Thr Arg His Asn Glu Asp Ser Ile Asn Ile Val Pro Leu Pro Thr Asn 
770 775 780 

Ala Ile Thr Pro Thr Glu Arg Ser Gln Thr Phe Glu Cys Arg Asn Val 
785 790 795 800 

Ser Thr Glu Pro Phe Leu Ile Ile Lys Glu Ser Thr Cys Leu Asn Tyr 
805 810 815 

Ser Asn Thr Val Leu Asn Ala Thr Tyr Ser Ser Asn Ile Pro Ile Gln 
820 825 830 

Pro Ile Glu Thr Phe Leu Val Gly Ile Gly Thr Tyr Glu Phe Arg Ile 
835 840 845 

Asn Met Thr Asp Leu Thr Thr Met Gln Val Val Ser His Ile Phe Thr 
850 855 860 

Leu Asn Val Val Ala Asp Ser Thr Ser Thr Ser Glu Val Thr Ser Thr 
865 870 875 880 

Thr Ser Thr Gly Ser Ser Ser Glu Ser Ser Ala He Ser Thr Thr Ser 
885 890 895 

Gly He Glu Ser Thr Ser Thr Leu Glu Ala Ser Thr Thr Asp Ala Ser 
900 905 910 

Gin Asp Ser Ser Thr Ser Thr Ser Asp Ser Gly Thr Thr Ser Asp Ser 
915 920 925 

Thr Thr He Asp Ser Ser Asn Ser Thr Pro Ser Thr Ser Asp Ser Ser 
930 935 940 

Gly Leu Ser Gin Thr Pro Ser Asp Ser Ser Ser Ala Ser Asp Ser Met 
945 950 955 960 

Arg Thr Thr Thr Val Asp Pro Asp Ala Ser Thr Glu Thr Pro Tyr Asp
965 970 975

Phe Val Leu Glu Asn Leu Thr Trp Asn Glu Thr Val Tyr Tyr Ser Glu 
980 985 990 

Asn Pro Phe Tyr Ile Thr Pro Ile Pro Asn Lys Glu Pro Gly Ala Leu
995 1000 1005

Thr Thr Ala Met Thr Cys Gln Cys Arg Asn Asp Ser Ser Gln Pro Phe
1010 1015 1020

Val Leu Leu Lys Glu Ser Asn Cys Leu Thr Glu Phe Gly Lys Asn Gly
1025 1030 1035 1040

Ala Tyr Ser Ala Ser Val Ser Phe Asn Pro Met Thr Ser Phe Val Pro
1045 1050 1055

Ala Thr Gly Thr Tyr Glu Phe Leu Ile Asn Val Thr Asn Arg Ala Ser
1060 1065 1070

Gly Glu Ser Ala Ser His Ile Phe Thr Met Asn Val Val Leu Pro Thr
1075 1080 1085 

Thr Thr Thr Glu Thr Pro Pro Thr Thr Val Ser Ser Ser Asp Asp Ala 
1090 1095 1100 

Gly Gly Lys Thr Gly Gly Thr Gly Ala Thr Gly Gly Thr Gly Gly Thr 
1105 1110 1115 1120

Gly Ser Gly Gly Ser Ala Thr Thr Leu Ser Thr Gly Asp Ala Val Arg 
1125 1130 1135 

Ser Thr Thr Ser Gly Ser Gly Ser Gly Gln Ser Ser Thr Gly Ser Gly
1140 1145 1150 

Ala Gly Gly Ser Gly Thr Thr Ala Ser Gly Ser Gly Ser Gly Gly Ser 
1155 1160 1165 

Ser Gly Thr Gly Ser Asp Gly Val Asn Ser Gly Lys Thr Thr Ala Leu 
1170 1175 1180 

Asn Gly Asp Gly Thr Gly Ser Gly Thr Ala Thr Thr Pro Gly Ser His 
1185 1190 1195 1200 

Leu Gly Asp Gly Gly Ser Thr Ser Gly Ser Gly Ser Asp Ser Asn Gly 
1205 1210 1215 

Ser Ser Gly Val Ser Thr Lys Ser Ser Ser Gly Ser Asp Thr Ser Gly 
1220 1225 1230 

Ser Ser Asp Ser Ser Gly Ala Asn Gly Ala Phe Ser Ala Thr Ala Gln
1235 1240 1245 

Pro Ser Thr Arg Thr Thr Lys Thr Arg Ser Ser Leu Ala Thr Val Ser 
1250 1255 1260 

Pro Ile Ser Ala Ala Glu Gln Ala Ile Ile Asp Ala Gln Lys Ala Asp
1265 1270 1275 1280

Val Met Asn Gln Leu Ala Gly Ile Met Asp Gly Ser Ala Ser Asn Asn
1285 1290 1295

Ser Leu Asn Thr Ser Ser Ser Leu Leu Asn Gln Ile Ser Ser Leu Pro
1300 1305 1310

Ala Ala Asp Leu Val Glu Val Ala Gln Ser Leu Leu Ser Asn Thr Leu
1315 1320 1325

Lys Ile Pro Gly Val Gly Asn Met Ser Ser Val Asp Val Leu Lys Thr
1330 1335 1340

Leu Gln Asp Asn Ile Ala Thr Thr Asn Ser Glu Leu Ala Asp Glu Met
1345 1350 1355 1360

Ala Lys Val Ile Thr Lys Leu Ala Asn Val Asn Met Thr Ser Ala Gln
1365 1370 1375 

Ser Leu Asn Ser Val Leu Ser Ser Leu Asp Leu Ala Leu Lys Gly Ser 
1380 1385 1390 

Thr Val Tyr Thr Leu Gly Val Ser Ser Thr Lys Ser Lys Asp Gly Thr 
1395 1400 1405 

Tyr Ala Val Ile Phe Gly Tyr Val Ile Ala Ser Gly Tyr Thr Leu Val
1410 1415 1420

Ser Pro Arg Cys Thr Leu Ser Ile Tyr Gly Ser Thr Ile Tyr Leu Thr
1425 1430 1435 1440

Gly Asp Thr Arg Ala Ser Tyr Lys Gln Leu Asp Gly Asp Thr Val Thr
1445 1450 1455

Ala Asp Thr Met Leu Ala Ala Ala Ile Gly Ile Gln Gly Met Phe Ala
1460 1465 1470

Thr Asn Gly Arg Thr Val Gln Val Glu Gln Asp Lys Ile Asp Asp Lys
1475 1480 1485

Arg Ser Leu Val Ser Gly Asn Ile Met Ala Thr Met Ser Gly Val Gly
1490 1495 1500

Asp Val Gln Ser Gly Glu Tyr Ser Tyr Asn Asp Met Tyr Val Thr Ala
1505 1510 1515 1520

Trp Asn Val Thr Tyr Asp Asn Ser Thr Val Gly Ser Thr Ser Gln Lys
1525 1530 1535

Asn Thr Ser Phe Ser Phe Asn Ile Pro Val Ser Glu Val Gln Tyr Ile
1540 1545 1550 

Leu Leu Ile Glu Ser Gly Thr Met Ile Lys Leu His Ser Thr Gln Asn
1555 1560 1565

Ile Val Ser Arg Gly Leu Val Val Thr Ala Ser Tyr Gly Gly Val Thr
1570 1575 1580

Tyr Thr Ile Thr Cys Thr Asn Gly Thr Gly Lys Phe Val Glu Val Asp
1585 1590 1595 1600

Thr Asp Asn Ala Ile Phe Ser Tyr Asn Ala Asp Ser Phe Thr Val Val
1605 1610 1615

Ala Ser Asp Gly Ser Ser Ala Ser Thr Val Lys Lys Leu Ile Gln Met
1620 1625 1630

Pro Ile Val Ile Glu Asn Val Asn Leu Ala Leu Phe Asn Gln Thr Thr
1635 1640 1645

Ser Pro Leu Val Phe Ser Asn Ala Gly Ser Tyr Ser Met Arg Met Val
1650 1655 1660

Leu Ser Pro Gln Asp Ile Gly Ile Pro Ala Val Ser Ala Leu Ser Gln
1665 1670 1675 1680



Thr Val Ser Ile Ser Thr Leu Ser Pro Thr Ala Ser Tyr Thr Lys Asp
1685 1690 1695

Asp Leu Gln Ser Leu Ile Lys Glu Gln Thr Leu Val Thr Val Ser Gly
1700 1705 1710

Thr Thr Ser Asn Ser Leu Leu Ser Ile Ala Gly Ser Leu Thr Ser Ala
1715 1720 1725

Leu Lys Ile Ala Leu Asp Asn Pro Leu Ser Ser Asp Leu Ala Ala Asn
1730 1735 1740

Leu Lys Tyr Ala Thr Asp Asn Tyr Asp Ser Leu Tyr Asn Val Leu Pro
1745 1750 1755 1760

Ser Asp Pro Asp Asn Ile Val Tyr Val Glu Glu Met Thr Ser Glu Glu
1765 1770 1775

Trp Ala Ala Tyr Val Thr Lys Met Phe Gln Lys Asn Ile Ala Lys Asn
1780 1785 1790

Leu Ala Asn Gln Leu Ala Ser Thr Leu Asp Thr Leu Glu Asn Thr Leu
1795 1800 1805

Ala Ala Arg Ala Ile Ala Thr Gly Asn Leu Pro Tyr Asp Tyr Ser Asn
1810 1815 1820

Ser Val Asp Gly Thr Gly Met Val Ile Val Ile Asp Asp Ala Ser Asn
1825 1830 1835 1840

Ile Val Gly Lys Thr Gln Asn Cys Glu Glu Trp Ala Phe Lys Leu Pro
1845 1850 1855

Ser Pro Ala Ser Thr Leu Asn Thr Ala Glu Ile Thr Asp Lys Thr Leu
1860 1865 1870

Ile Gln Val Gly Leu Val Cys Tyr Ala Thr Asn Pro Arg Thr Tyr Val
1875 1880 1885

Asp Asn Phe Asp Met Leu Ile Thr Ser Gly Ala Leu Glu Ala His Ile
1890 1895 1900 

Lys Asp Glu Asn Gln Ile Ile Ile Pro Ile Thr Gly Thr Thr Ala Pro
1905 1910 1915 1920

Ile Tyr Val Asn Gly Arg Gly Ser Glu Asp Asp Ala Val Leu Thr Leu
1925 1930 1935

Met Gln Gln Gly Asp Phe Ala Ser Tyr Gln Ile Leu Asp Leu His Ala
1940 1945 1950

Phe Arg Thr Thr Asn Trp Asn Asn Ser Leu Gln Val Glu Ile Ile Ala
1955 1960 1965

Ser Gln Asp Tyr Glu Ile Pro Asn Asn Asp Asp Thr Tyr Met Phe Ser
1970 1975 1980

Ser Phe Gln Ser Leu Pro Gly Pro Leu Glu Ser Asn His Glu Trp Ile
1985 1990 1995 2000

Phe Asp Leu Asn Thr Leu Asn Lys Thr Ser Asn Tyr Phe Val Thr Ala
2005 2010 2015

Gly Asn Leu Ile Asn Asn Thr Gly Leu Phe Phe Ile Gly Ile Gly Lys
2020 2025 2030

Arg Asn Ser Ser Thr Asn Thr Gly Asn Ser Ser Asp Ile Val Asn Tyr
2035 2040 2045

Gly Gln Tyr Asp Ser Met Gln Trp Ser Phe Ala Arg Ser Val Pro Met
2050 2055 2060

Asp Tyr Gln Val Ala Ala Val Ser Lys Gly Cys Tyr Phe Tyr Gln Lys
2065 2070 2075 2080

Thr Ser Asp Val Phe Asn Ser Glu Gly Met Tyr Pro Ser Asp Gly Gln
2085 2090 2095

Gly Met Gln Phe Val Asn Cys Ser Thr Asp His Leu Thr Met Phe Ser
2100 2105 2110 

Val Gly Ala Phe Asn Pro Thr Ile Asp Ala Asp Phe Ser Tyr Asn Tyr
2115 2120 2125

Asn Val Asn Glu Ile Glu Lys Asn Val Lys Val Met Ile Ala Ala Val
2130 2135 2140

Phe Met Leu Val Val Tyr Gly Cys Leu Thr Ile Asn Ala Ile Ile Cys
2145 2150 2155 2160

Gln Arg Lys Asp Ala Ser Arg Gly Arg Leu Arg Phe Leu Lys Asp Asn
2165 2170 2175

Glu Pro His Asp Gly Tyr Met Tyr Val Ile Ala Val Glu Thr Gly Tyr
2180 2185 2190

Arg Met Phe Ala Thr Thr Asp Ser Thr Ile Cys Phe Asn Leu Ser Gly
2195 2200 2205

Asn Glu Gly Asp Gln Ile Phe Arg Ser Phe Arg Ser Glu Glu Asp Gly
2210 2215 2220 

Asn Trp Glu Phe Pro Phe Ser Trp Gly Thr Thr Asp Arg Phe Val Met 
2225 2230 2235 2240 

Thr Thr Ala Phe Pro Leu Gly Glu Leu Glu Tyr Met Arg Leu Trp Leu 
2245 2250 2255 

Asp Asp Ala Gly Leu Asp His Arg Glu Ser Trp Tyr Cys Asn Arg Ile
2260 2265 2270

Ile Val Lys Asp Leu Gln Thr Gln Asp Ile Tyr Tyr Phe Pro Phe Asn
2275 2280 2285 

Asn Trp Leu Gly Thr Lys Asn Gly Asp Gly Glu Thr Glu Arg Leu Ala 
2290 2295 2300 

Arg Val Glu Tyr Lys Arg Arg Phe Leu Asp Glu Ser Met Ser Met His 
2305 2310 2315 2320 

Met Leu Ala Gln Thr Ile Ser Trp Phe Ala Met Phe Thr Gly Gly Gly
2325 2330 2335

Asn Arg Leu Arg Asp Arg Val Ser Arg Gln Asp Tyr Ser Val Ser Ile
2340 2345 2350

Ile Phe Ser Leu Val Val Val Ser Met Ile Ser Ile Thr Ile Leu Lys
2355 2360 2365

Ser Asp Asn Ser Ile Ile Ser Asp Ser Lys Ser Val Ser Glu Phe Thr
2370 2375 2380

Phe Thr Ile Lys Asp Ile Ala Phe Gly Val Gly Phe Gly Val Leu Ile
2385 2390 2395 2400

Thr Phe Leu Asn Ser Leu His Ile Leu Leu Cys Thr Lys Cys Arg Ser
2405 2410 2415 

His Ser Glu His Tyr Tyr Tyr Lys Lys Arg Lys Arg Glu Asp Pro Glu 
2420 2425 2430 

Phe Lys Asp Asn Ser Gly Ser Trp Pro Met Phe Met Ala Gly Met Ala 
2435 2440 2445 

Arg Thr Ile Ile Val Phe Pro Val Leu Met Gly Leu Ile Tyr Ile Ser
2450 2455 2460

Gly Ala Gly Met Ser Leu Met Asp Asp Leu Ala Asn Ser Phe Tyr Ile
2465 2470 2475 2480

Arg Phe Leu Ile Ser Leu Ile Leu Trp Ala Val Val Phe Glu Pro Ile
2485 2490 2495

Lys Gly Leu Ile Trp Ala Phe Leu Ile Leu Lys Thr Arg Lys Ser His
2500 2505 2510

Lys Ile Ile Asn Lys Leu Glu Glu Ala Leu Leu Arg Ala Lys Pro Ala
2515 2520 2525

Glu Thr Phe Leu Arg Asn Pro Tyr Gly Lys Ile Glu Lys Gly Leu Gly
2530 2535 2540

Thr Glu Ile Ala Asp Val Thr Lys Leu Arg Asp Thr Glu Asn Arg Lys
2545 2550 2555 2560

Met Arg Asp Glu Gln Leu Phe Ile Thr Ile Arg Asp Met Leu Cys Phe
2565 2570 2575

Phe Ala Ser Leu Tyr Ile Met Val Met Leu Thr Tyr Tyr Cys Lys Asp
2580 2585 2590

Arg His Gly Tyr Trp Tyr Gln Leu Glu Met Ser Thr Ile Leu Asn Ile
2595 2600 2605

Asn Gln Lys Asn Tyr Gly Asp Asn Thr Phe Met Ser Ile Gln His Ala
2610 2615 2620 

Asp Asp Phe Trp Asp Trp Ala Arg Glu Ser Leu Ala Thr Ala Leu Leu 
2625 2630 2635 2640 

Ala Ser Trp Tyr Asp Gly Asn Pro Ala Tyr Gly Met Arg Ala Tyr Met 
2645 2650 2655 

Asn Asp Lys Val Ser Arg Ser Met Gly Ile Gly Thr Ile Arg Gln Val
2660 2665 2670

Arg Thr Lys Lys Ser Ala Glu Cys Thr Met Phe Lys Gln Phe Gln Gly
2675 2680 2685

Tyr Ile Asn Asp Cys Gly Glu Glu Leu Thr Ser Lys Asn Glu Glu Lys
2690 2695 2700

Thr Leu Tyr Met Gln Ala Gly Trp Thr Glu Leu Glu Ser Glu Asn Gly
2705 2710 2715 2720 

Thr Asp Ala Ser Asp Glu Tyr Thr Tyr Lys Thr Ser Glu Glu Leu Ser 
2725 2730 2735 
Thr Glu Thr Val Ser Gly Leu Leu Tyr Ser Tyr Ser Gly Gly Gly Tyr
2740 2745 2750

Thr Ile Ser Met Ser Gly Thr Gln Ala Glu Ile Ile Thr Leu Phe Asn
2755 2760 2765

Lys Leu Asp Ser Glu Arg Trp Ile Asp Asp His Thr Arg Ala Val Ile
2770 2775 2780

Ile Glu Phe Ser Ala Tyr Asn Ala Gln Ile Asn Tyr Phe Ser Val Val
2785 2790 2795 2800

Gln Leu Leu Val Glu Ile Pro Lys Ser Gly Ile Tyr Leu Pro Asn Ser
2805 2810 2815

Trp Val Glu Ser Val Arg Leu Ile Lys Ser Glu Gly Ser Asp Gly Thr
2820 2825 2830

Val Val Lys Tyr Tyr Glu Met Leu Tyr Ile Phe Phe Ser Val Leu Ile
2835 2840 2845

Phe Val Lys Glu Ile Val Phe Tyr Leu Tyr Gly Arg Tyr Lys Val Ile
2850 2855 2860

Thr Thr Met Lys Pro Thr Arg Asn Pro Phe Lys Ile Val Tyr Gln Leu
2865 2870 2875 2880

Ala Leu Gly Asn Phe Ser Pro Trp Asn Phe Met Asp Leu Ile Val Gly
2885 2890 2895

Ala Leu Ala Val Ala Ser Val Leu Ala Tyr Thr Ile Arg Gln Arg Thr
2900 2905 2910 

Thr Asn Arg Ala Met Glu Asp Phe Asn Ala Asn Asn Gly Asn Ser Tyr 
2915 2920 2925 

Ile Asn Leu Thr Glu Gln Arg Asn Trp Glu Ile Val Phe Ser Tyr Cys
2930 2935 2940

Leu Ala Gly Ala Val Phe Phe Thr Ser Cys Lys Met Ile Arg Ile Leu
2945 2950 2955 2960

Arg Phe Asn Arg Arg Ile Gly Val Leu Ala Ala Thr Leu Asp Asn Ala
2965 2970 2975

Leu Gly Ala Ile Val Ser Phe Gly Ile Ala Phe Leu Phe Phe Ser Met
2980 2985 2990 

Thr Phe Asn Ser Val Leu Tyr Ala Val Leu Gly Asn Lys Met Gly Gly 
2995 3000 3005 

Tyr Arg Ser Leu Met Ala Thr Phe Gln Thr Ala Leu Ala Gly Met Leu
3010 3015 3020

Gly Lys Leu Asp Val Thr Ser Ile Gln Pro Ile Ser Gln Phe Ala Phe
3025 3030 3035 3040

Val Val Ile Met Leu Tyr Met Ile Ala Gly Ser Lys Leu Val Leu Gln
3045 3050 3055

Leu Tyr Val Thr Ile Ile Met Phe Glu Phe Glu Glu Ile Arg Asn Asp
3060 3065 3070

Ser Glu Lys Gln Thr Asn Asp Tyr Glu Ile Ile Asp His Ile Lys Tyr
3075 3080 3085 

Lys Thr Lys Arg Arg Leu Gly Leu Leu Glu Pro Lys Asp Phe Ala Pro 







3090 3095 3100 

Val Ser Ile Ala Asp Thr Gln Lys Asp Phe Arg Leu Phe His Ser Ala
3105 3110 3115 3120

Val Ala Lys Val Asn Leu Leu His His Arg Ala Thr Arg Met Leu Gln
3125 3130 3135

Thr Gln Gly Gln Tyr Gln Asn Gln Thr Val Ile Asn Tyr Thr Leu Ser
3140 3145 3150

Tyr Asp Pro Val Ser Ala Ile His Glu Thr Gly Pro Lys Arg Phe Gln
3155 3160 3165 

Lys Trp Arg Leu Asn Asp Val Glu Lys Asp 
3170 3175 

<210> 5 
<211> 8073 
<212> DNA 

<213> C. elegans pkd-2 gene
<400> 5 

tcattcttct tttttgtcag caatcgaggt gattgttgga cgacgagcgg cagattcacg 60 
gttacggact tggttggtga ggagggcctg gacaagtaaa atatttattg gaaatttaga 120
tatttagcag taacagcaaa attatttgta ttttgttgtt taatttacta aatagtaaaa 180
attgtaagtt ttcattaatt cttattgcca gaataaaaaa ttttctaatt ttgttttgtc 240
taatttgtct aaaactacga aagtttttct ctaaaaattt cactagataa atacaatttt 300 
tcatgtttca attactttcc aaaagaagta acactataat tgcattagtt acaattttca 360 
actcacactc aaatccatca aatttcctcc atcttgttgt tgaactcttt gtttttcgat 420
tgtctggagt gttgcattga ctccttcaat ccgatccaca atgctgaaca ctgattcttg 480 
catttgatct acacggcggt tcaaactgaa atgatttacg taatgtttat gatcatttat 540
gatagagctg atacagtaaa agttaccaat ttttgtttct attcttcgga attgtgaaaa 600
aatacaattt tctcatggtt ttcattattt gaaaattcca gtcttcacac gtataaactg 660 
gaacacgaaa aactatgggt tttattctag aatactaatt ttttaatcga taaataatat 720
tatcgtcaaa aaagcataaa gttttttttg taagatatat gaaaatcgaa taacaaaagt 780 
taaacttaat caatttatga aaacattgaa ccagtcaaaa atctaattgt gataccgtga 840 
aaaaaaaacg tttccctcca aaagtttacc tttttcaagt cttctgttaa caaattttca 900 
gaacgtttat atttgtatgg tgacggtgaa acattatttg atcaaaactg ctgtgggaac 960 
tgacggttat tatataatta aggttattat ggtaacagtg aaacagtatt taaaaatagc 1020 
tgtttcggta ctcaaggggt atcccatgag gaaaataaaa gtattacttt ttcagttatg 1080 
aaaactgaga atgttttcac aaaatgttac ctgtggtctg tttgggaaaa aggaaatcta 1140 
cgatgagaaa tttgcagaac attttttgtc aaaattctct acatgttttt ttttgttgta 1200
cgcagcacag cggaagttca ggtggttatg aaagagtaaa tatttttttt ctgtgatata 1260 
aaaaatgttt gcctgtcttg acggctgcgg gccagcacat ttgcctacgt ttcaggtaaa 1320
catgattttt gtaattttcc agtggcatgt aggcccgcag gtaggcaggc ctaacaattt 1380
gaccatttaa agttgtgtac acaataaaat attaattctt taaaatataa tcatttgaaa 1440
attgaaatgc gaaccttcgg ttattatcga attgaatgaa aaacaaaaag aaaataattc 1500
taaaaactag ctgaaacatc acaattttcc gtaaaactca ctttgcgtaa tccctgtgat 1560
tctccatata atttcttttc tgttccgtca ttctagccac ctcatcagca atatcttcag 1620
ccactttctc cggaacatgc tcagtcattg atgttacatt gaatcgagtg aatgcttcat 1680
tgatgtcttt ttcagcgtat ccggcacggt agagcatcag tttgtagtct tcatacgtgg 1740
catcctctcc aggggcatcc gggcgttttc cacgttttgt gagtcctcga actttctgga 1800
atgaggtatt ttgtggtttt agccaagcgc ccgacgtatt tcgggaactc ttagaatatg 1860
gggcgttgat gaaccctgaa gcacccgaca tattccaggt ttcaacacaa acccagaaaa 1920
tgtccgacgc tagtttaggt acaccaagta acttacattc ataaaccaat ccaaaatccc 1980
ctctccatct ttctttctag ccagctctgc tttcacttca acgtaggaat cattgatgat 2040
agccaagaac atgttcaata ggatgaacga gacgaagaag acgtaggcaa tgaagaaggc 2100
gggtccgaag aatcgattgc aggattctag agccgagaag ttaaagtcac cgagaatgag 2160
acggagcagg gcgaacgcag agttgtagag gttggagtag tcggcgatct tagaaaaatg 2220
tgaagcgccc gacatttacc gggttttgtg taggcaaaac ccggaagatg tcggatgcaa 2280
gaatgtaaca tgccgattag gatacttggt atcactcagt cagataacca ccatttttgt 2340
taaaagaaaa tttactgttt cattcaagtt atataagtaa ttggaagatc ccgctgcggt 2400
gaagcgtatt taagattgtt aaaatagctg tgttgatatt tgggtacgtc aaaataaagg 2460
aaatgaatgt tgtaatggat cagacatctg gcgggctcgg tgtaggcaga accaggcaga 2520
tgtcggatgc atgaatgtaa aacgcatccg acatctgccg ggttcttggt tcaaggtaag 2580
cttgataata tttaaaaatg aaaaaaaaac accaggcaga tgtcgggtgc taaaataatt 2640
gctgcgaatt tcccgtttcg taaactttat gagatggaaa tgaatcaaaa tgtcattgta 2700
cctaagaatg cattcgaatg gtagtaaaaa taaatgtttt tcatataaaa ttggtgaaac 2760
tgcgattttt ttctaatttt atatttttta aatttcacag caatataaaa cgttacagta 2820
ccccaactat tctaaactcc acgaataaaa caaagatctt aaagattaag ttacctgtgt 2880
cccaaagcac aaatatccaa actgtgcgaa tgcaaaaaag aaaacagcga acatcactgc 2940
aaatcctcca atatcctttg cagatctggt caacgtagag gacaactgtg acatggtctt 3000
gttaactgag atgaacttga acactttcac ccaagcaaca aataccacac atgctttgat 3060
gttcagataa gagttctcgg aagaagtgac gtcatcgaat ggtgcatttg tcaatccgtt 3120
ctcaatgaca gagttgacac gatttactcc ggtttttgtg cgattcactg acagaattat 3180
tgtggctact gaaaatccta gcagcacaac gtctaccaaa ttccagaact gggtgagata 3240
gtggagacgg tgacggccga tagcaaaaag ctcctcgaaa atgaagtata gtatgaatcc 3300
acagaagatt ccttcaaaaa tcatcattcg ggtgcctcca gatgtttgat aggtcagaag 3360
atcgtaagtc ataagctttg gagttgtgat aacaccgcca gatgcaggga gctcaaatag 3420
gagtctgaaa tgggaaattt cgaaaaaaat ttaactcgct gcttcagctt tatcataaaa 3480
ttggcgcact tatttgaaaa ttattatctg atcgacattg attggaatgc aaatatttat 3540
aataaatttg ttgacgtaac taaagtttaa aaatccagtt taaaaaaact atgtaaaatt 3600
tcagtactct tgaaactaga caagatttat acttgttttc atttccatag acaccctcac 3660
agttggccgg gtgactgata tgtatggccc gacatttttc gggttactgt ggattcatag 3720
ttttcggtgt ggcccattgc aaggcaaagc tagtgcggcg cgaaactcgg aaaacgtcgg 3780
accatgcata tcagtcaaat gccactcgaa tttcgaaatt tttgaatgaa cgtttactct 3840
tgttgaaata cttataatta cagtttcaca aacattgtaa aattttagtc aaaaacgaga 3900
caccattcca ccaaacatga tagaactcac ttcaccacac aaaacagatt aatattcgca 3960
ttgtacagag caaagtccac aataattgca cgtgatcctc tgtcgatcca gcgattagcc 4020
tttaacgtgg caattgcaga ttgagcttca gttgagccag ctactggaag gcgttgaaca 4080
aatccaccac ctccatatga agcaatggtg cccacggttt tcaggttttc aagctctttt 4140
gccgtggcgt agatgaatct gaatataata ttttatttaa aaaaaggatt ggtgagactg 4200
ttttttatag gaattatatg ttgacaataa ctatctaaga ctaacaatta aatgaaaatt 4260
gcatgacaac cataatgttc taaaatttaa aaaaaggagc atgaaacatt acgaatatta 4320
gttagaatat ttcaattttc gaggtacttt tcacaaactt tacatttttt tcaacgtttt 4380
ttaataagaa tactctttca ggtagttaat atataagcta aattttgcat ttgtgtattg 4440
aagcttttgc aaaaacacat aaacagatat aactgataat ttcttggaac ataaaattgt 4500
attttcatgc aaatttcgta acattctttc aaatacagtt tcataatatt gttaaaaaga 4560
aattggggtt ttctcaatag tccataaaat tctaaatatt tttaaaataa aactaagtat 4620
tttccgcaaa taagtcaagt tttgcaataa aatttactgt ttcacattat gatcaagttt 4680
gcatcacaat aagaaataat agtaaaaatt ggttctccat gaaaaaaccc cataaatgcc 4740
atgaaacaac gttagctccg cctttcacca atcgccgatt ggtcagcaga attcaaaagg 4800
tactagaagc tgctgattca acgaccaaac ttggccgaat ttacaaaatt gacgtcactc 4860
acgcatcaac acttccatca ccgaccatcg tcttatcctc gagcttttcc tcataatttg 4920
caaaacattc cttaatctcc cgctggaaac ttttcatcac agtacacgag tcatttgtca 4980
ctttcaacat tctgatccga ggttccccaa gcaaacgatt ctcatagtag atcatattct 5040
cgttatccgt cgaattggaa gtttccgtcc aatatatgcc aggtattagg acttgtgaca 5100
gccactgaaa gtttgatttg aaggttttca tttaaaaatt gaggaaactt acatcccaaa 5160
tattatccat tgaggtacaa gatccaaatg ctggagctcc ggaggcaccg gtgctcgcca 5220
caaacaggtc gctcattact ttggagtagt agtaagattg gatgctgttt tgggcgaacg 5280
caactgtaaa tttttgaatt tagaaaaaaa aaacccgtga agtgtcgggt gctaactggg 5340
cgtgctcgat atatcacagg attagcccga ctacctgcga ggtgtcgcgc gaaacactag 5400
atgaaaattt tacaagaaaa tgattttcga aaatacaaac atttgttaac attaattgta 5460
tttttaagtt gtaaacgcaa aaataaatat tggaaatttg aaaatgtttt gttacaaaaa 5520
ttctgctgtt ttgcttacta agtaaaccta acaaattata ggtaaaaata gtatgtgaac 5580
gtttcatgag gttattcaag tagtgtcgga aaattaaaaa gtgtagaaaa attacgtcac 5640
aactgtatta aaatacataa aaacatgtat tttaatacat ttgtgacgtc acaaatgtat 5700
ttaaatacat tttgctacat tacttgatta accccattaa caaagttgta ctcgtaaaat 5760 
ttcagttgaa atgctcaaac tcactaaacg tgttgaggaa aaaaaataaa aatttaaaaa 5820 
aaaactgttc caccgttgta acaaatgttg tacgcgtttg tcttaaatag tattcggagg 5880 
attcagcctg caatggacag ttttcaaaag agaaaaattt aactaattgg aagccattta 5940
atcaaaaatt atgaatttag agattacttt gaaaaatgta tgattctaaa cgtttctttt 6000 
gtgtttattt gcaaaattca aatataagtt tttccacttt tcaaaaccta tttataaaaa 6060 
ttagaaaatt aaacaatttt ccaaacaaca ttttttcccg tactgcatta aagtaacaac 6120 
ataaattgga agattagtaa ctactttggt catagtgttt ccaacaaagt gtggttttta 6180 
tgatgctcac aataaatttt tcgaatgcca gttgaaacat ttttgaaaaa ttataaaaca 6240 
cgaaatgaat attttgcagt tgatagttac aaatccctgc caaatctttt ttttcacaaa 6300
cttgaatttt aagaaatttg ctaaaaaaaa acttcggctg tttcatacat gccatataat 6360 
ttgtaaaaat aaagtgaaaa tcgattcgtc gtgtgtagtt tcgccactca ctataaaatt 6420 
gctgattaag tatagtgagt ggcgaaactc ggaaattgtc ggccgccgtg gaaacctacc 6480 
ccaaaaccgg acgcagtgcg tccggtggtg ttaaaatcgg acgaccggac gccgatttgt 6540
acagccctat ttgaaagtaa tgacgtcata cttactttca tacagaaatt aaatatctga 6600
tacgttagat tttgggaaat aagcttgtca caaaaaatga tgtggtttat ttctagaagt 6660
cttactatgt agttggtaca caaaatatga aatttgtagc gtatgcttca tagcagttac 6720
aaagtcgaga actatttgta cattaatttg accaacaaac ttaccataaa ccagcacaat 6780 
caagaacaca gcgtatccac caacttccat aaacgaacgg gcagtcagct tgatctttcc 6840 
atccgatttc tcgtgtccgg atgccagcaa ggcttgagaa aacgagattc cctctttttg 6900
agccggattc ttcttcttat cgtgctcata ctcctcactg accatagaat ggtcaaacga 6960
ggggccatgc tccgcagcgg cgaccggctg cggtggatta gcccatcgct cgtccgcagc 7020
gccgtagttc attgaagacg gctcgctgaa acagtagaaa atttgaatta aagttttgag 7080 
aaaagttgaa aatcgagagc tctgtagtgt aaaaactgga aaaatagagt cgaaaagagg 7140 
cgagctcgcg aaatccacgt cctcgtagct cttggagatg ccgcattgct aagagatttc 7200
cgtagatact atgttttatg ggatttcacg tttttggttg gagacggttt tttgcataga 7260 
aacggaaaaa tgatgcagga atagaaaacg aacatgattt gaaactgaaa accatcgact 7320 





atacggcaca atcatactac atttatcggg ttattgaaac tgcatcccaa aagtttacaa 7380
tttaaattca cataccattt gaagataaca acgaataaaa agacttcgaa aggcggcaaa 7440
tgtcgtggtt tcgtggtgta gtggttatca catctgtcta acacacagaa ggtcggtggt 7500
tcgagcccgc ccgagatcat aagttttttg tcaatcatta atattgattc atctgaatga 7560
aattgtaaaa ttctttgaag gtgttctaaa atattgaact gttttttttt agatttcgtt 7620
agtatataat ttttgaaaca tacatttttt tcttccaaat ttcaagtatc ttctacgatt 7680
tttgaaaaat cccaaaaatt gtaaacatta aaattctgaa taaacggtgg aaatttgtag 7740
ttctctcaaa ttctaaataa aaattgaacg aaatttgaga aatttcctgt ttcaaaaact 7800
aaatgtctta ttttcagagt tcaacaatgc cttagagaaa gttggaaaat gataatgttt 7860
gttagtatat tgagaatatc atgcaagtga aacaattagt ttttttttcg ataacaatta 7920
tttaaaaaaa actactgttt caaatctttt attcaaccaa tcctgtaata aaagttcact 7980
tatcttctcc ctcttcatcc ataatgtatg cccctcttca aatggaaaat atgatgtcgg 8040
ggggaggtcc tccccctccc cacgaccctc cat 8073



<210> 6 
<211> 815 
<212> PRT 

<213> C. elegans Pkd-2 protein



<400> 6 

Met Glu Gly Arg Gly Glu Gly Glu Asp Leu Pro Pro Thr Ser Tyr Phe 
1 5 10 15

Pro Phe Glu Glu Gly His Thr Leu Trp Met Lys Arg Glu Lys Ile Lys
20 25 30

His Leu Gln Arg Ile Leu Gln Phe His Ser Asp Glu Ser Ile Leu Met
35 40 45

Ile Asp Lys Lys Leu Met Ile Ser Gly Gly Leu Glu Pro Pro Thr Phe
50 55 60

Cys Val Leu Asp Arg Cys Asp Asn His Tyr Thr Thr Lys Pro Arg His
65 70 75 80

Leu Pro Pro Phe Glu Val Phe Leu Phe Val Val Ile Phe Lys Cys Glu
85 90 95

Pro Ser Ser Met Asn Tyr Gly Ala Ala Asp Glu Arg Trp Ala Asn Pro
100 105 110

Pro Gln Pro Val Ala Ala Ala Glu His Gly Pro Ser Phe Asp His Ser
115 120 125

Met Val Ser Glu Glu Tyr Glu His Asp Lys Lys Lys Asn Pro Ala Gln
130 135 140

Lys Glu Gly Ile Ser Phe Ser Gln Ala Leu Leu Ala Ser Gly His Glu
145 150 155 160

Lys Ser Asp Gly Lys Ile Lys Leu Thr Ala Arg Ser Phe Met Glu Val
165 170 175

Gly Gly Tyr Ala Val Phe Leu Ile Val Leu Val Tyr Val Ala Phe Ala
180 185 190

Gln Asn Ser Ile Gln Ser Tyr Tyr Tyr Ser Lys Val Met Ser Asp Leu
195 200 205 

Phe Val Ala Ser Thr Gly Ala Ser Gly Ala Pro Ala Phe Gly Ser Cys 
210 215 220 

Thr Ser Met Asp Asn Ile Trp Asp Trp Leu Ser Gln Val Leu Ile Pro
225 230 235 240

Gly Ile Tyr Trp Thr Glu Thr Ser Asn Ser Thr Asp Asn Glu Asn Met
245 250 255

Ile Tyr Tyr Glu Asn Arg Leu Leu Gly Glu Pro Arg Ile Arg Met Leu
260 265 270

Lys Val Thr Asn Asp Ser Cys Thr Val Met Lys Ser Phe Gln Arg Glu
275 280 285

Ile Lys Glu Cys Phe Ala Asn Tyr Glu Glu Lys Leu Glu Asp Lys Thr
290 295 300

Met Val Gly Asp Gly Ser Val Asp Ala Phe Ile Tyr Ala Thr Ala Lys
305 310 315 320

Glu Leu Glu Asn Leu Lys Thr Val Gly Thr Ile Ala Ser Tyr Gly Gly
325 330 335

Gly Gly Phe Val Gln Arg Leu Pro Val Ala Gly Ser Thr Glu Ala Gln
340 345 350

Ser Ala Ile Ala Thr Leu Lys Ala Asn Arg Trp Ile Asp Arg Gly Ser
355 360 365

Arg Ala Ile Ile Val Asp Phe Ala Leu Tyr Asn Ala Asn Ile Asn Leu
370 375 380 

Phe Cys Val Val Lys Leu Leu Phe Glu Leu Pro Ala Ser Gly Gly Val 
385 390 395 400 

Ile Thr Thr Pro Lys Leu Met Thr Tyr Asp Leu Leu Thr Tyr Gln Thr
405 410 415

Ser Gly Gly Thr Arg Met Met Ile Phe Glu Gly Ile Phe Cys Gly Phe
420 425 430

Ile Leu Tyr Phe Ile Phe Glu Glu Leu Phe Ala Ile Gly Arg His Arg
435 440 445

Leu His Tyr Leu Thr Gln Phe Trp Asn Leu Val Asp Val Val Leu Leu
450 455 460

Gly Phe Ser Val Ala Thr Ile Ile Leu Ser Val Asn Arg Thr Lys Thr
465 470 475 480

Gly Val Asn Arg Val Asn Ser Val Ile Glu Asn Gly Leu Thr Asn Ala
485 490 495

Pro Phe Asp Asp Val Thr Ser Ser Glu Asn Ser Tyr Leu Asn Ile Lys
500 505 510

Ala Cys Val Val Phe Val Ala Trp Val Lys Val Phe Lys Phe Ile Ser
515 520 525

Val Asn Lys Thr Met Ser Gln Leu Ser Ser Thr Leu Thr Arg Ser Ala
530 535 540



Lys Asp Ile Gly Gly Phe Ala Val Met Phe Ala Val Phe Phe Phe Ala
545 550 555 560

Phe Ala Gln Phe Gly Tyr Leu Cys Phe Gly Thr Gln Ile Ala Asp Tyr
565 570 575

Ser Asn Leu Tyr Asn Ser Ala Phe Ala Leu Leu Arg Leu Ile Leu Gly
580 585 590

Asp Phe Asn Phe Ser Ala Leu Glu Ser Cys Asn Arg Phe Phe Gly Pro
595 600 605

Ala Phe Phe Ile Ala Tyr Val Phe Phe Val Ser Phe Ile Leu Leu Asn
610 615 620

Met Phe Leu Ala Ile Ile Asn Asp Ser Tyr Val Glu Val Lys Ala Glu
625 630 635 640

Leu Ala Arg Lys Lys Asp Gly Glu Gly Ile Leu Asp Trp Phe Met Asn
645 650 655 

Lys Val Arg Gly Leu Thr Lys Arg Gly Lys Arg Pro Asp Ala Pro Gly 
660 665 670 

Glu Asp Ala Thr Tyr Glu Asp Tyr Lys Leu Met Leu Tyr Arg Ala Gly 
675 680 685 

Tyr Ala Glu Lys Asp Ile Asn Glu Ala Phe Thr Arg Phe Asn Val Thr
690 695 700

Ser Met Thr Glu His Val Pro Glu Lys Val Ala Glu Asp Ile Ala Asp
705 710 715 720

Glu Val Ala Arg Met Thr Glu Gln Lys Arg Asn Tyr Met Glu Asn His
725 730 735

Arg Asp Tyr Ala Asn Leu Asn Arg Arg Val Asp Gln Met Gln Glu Ser
740 745 750

Val Phe Ser Ile Val Asp Arg Ile Glu Gly Val Asn Ala Thr Leu Gln
755 760 765

Thr Ile Glu Lys Gln Arg Val Gln Gln Gln Asp Gly Gly Asn Leu Met
770 775 780

Asp Leu Ser Ala Leu Leu Thr Asn Gln Val Arg Asn Arg Glu Ser Ala
785 790 795 800 

Ala Arg Arg Pro Thr Ile Thr Ser Ile Ala Asp Lys Lys Glu Glu
805 810 815

<210> 7
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Outside primer for PCR screening of
lov-1 genomic (sy582) deletion

<400> 7

ctctatttgt ggttcgttgg cg 22

<210> 8
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Outside primer for PCR screening of 
lov-1 genomic (sy582) deletion 

<400> 8 

gggagtttcc gttttcatgg gg 22 

<210> 9 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Nested primer for PCR screening of 
lov-1 genomic (sy582) deletion

<400> 9 

ctaggaccga tgcaacagcg ag 22 

<210> 10 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Nested primer for PCR screening of 
lov-1 genomic (sy582) deletion 

<400> 10 

aacgctgatt ggttcaagtg tg 22 

<210> 11 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Outside primer for PCR screening of 
pkd-2 genomic (sy606) deletion

<400> 11 

cccctcgttt gaccattcta tgg 23 

<210> 12 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Outside primer for PCR screening of 
pkd-2 genomic (sy606) deletion

<400> 12 

acgtgatcct ctgtcgatcc ag 22 

<210> 13
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Nested primer for PCR screening of
pkd-2 genomic (sy606) deletion

<400> 13

agatcaagct gactgcccgt tc 22

<210> 14 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Nested primer for PCR screening of 
pkd-2 genomic (sy606) deletion 

<400> 14 

gatccagcga ttagccttta acg 23 

<210> 15 

<211> 2870 
<212> PRT 

<213> C. elegans Lov-1 sy582 deletion protein
<400> 15 

Met Val Leu Arg Phe Ser Pro Pro Phe Arg Phe Ser Thr Thr Ser Phe
1 5 10 15

Phe Ser Cys Cys Leu Phe Cys Ser Glu Phe Ile Phe Val Phe Arg Arg
20 25 30

Ile Phe Thr Lys Leu Leu Gln Asp Asn Leu Pro Ala His Trp Met Lys
35 40 45

Lys Ser Asn Phe Phe Val Leu Leu Leu Leu Ala Ile Ser Ala Ile Gln
50 55 60

Ile Asp Gly Leu His Tyr Gln Leu Leu Asp Gly Ile Ala Thr Phe Arg
65 70 75 80

Leu Asp Asn Asp Asp Thr Thr Ile Gly Gly Val Pro Arg Asn Ser Gln
85 90 95

Gly Val Val Lys Ile Lys Leu Ser Cys Gly Leu Asn Arg Leu Ser Val
100 105 110

Glu Asn Lys Val Thr Glu Val Ser Ser Leu Glu Leu Ile His Asn Cys
115 120 125

Ile Gln Thr Glu Thr Arg Leu Val Gly Leu Phe Leu Asn Ser Thr Trp
130 135 140

Ile Thr Leu Asn Glu Val Asn Asp Asp Asp Glu Ile Ser Ile Ala Val
145 150 155 160

Glu Ala Lys Tyr Glu Val Cys Tyr Asp Asp Gly Ile Asp Arg Cys Asp
165 170 175

Gly Ser Leu Trp Trp Leu Gln Val Gly Gly Asn Glu Met Ala Leu Leu
180 185 190

Gly Tyr Arg Glu Lys Cys Glu Ser Gly Glu Ile Asn Glu Glu Tyr Ala
195 200 205



Arg Arg Met Cys Lys Arg Pro Tyr Arg Ser Glu Lys Ser Thr Ala Ile
210 215 220

Ser Asp Ser Gln Gly Val Tyr Tyr Asp Gly Gln Val Leu Lys Gly Val
225 230 235 240

Arg Ala Lys Gln Phe Ser Met Arg Thr Ser Gly Ser Pro Thr Leu Arg
245 250 255

Arg Met Lys Arg Asp Ala Gly Asp Asn Thr Cys Asp Tyr Thr Ile Glu
260 265 270 

Ser Thr Ser Thr Ser Thr Thr Thr Pro Thr Thr Thr Thr Val Thr Ser 
275 280 285 

Thr Val Thr Ser Thr Thr Thr Val Pro Thr Ser Thr Ser Thr Val Thr 
290 295 300 

Thr Ala Met Ser Thr Ser Thr Ser Thr Pro Ser Thr Ser Thr Thr Ile
305 310 315 320 

Glu Ser Thr Ser Thr Thr Phe Thr Ser Thr Ala Ser Thr Ser Thr Ser 
325 330 335 

Ser Thr Ser Thr Thr Gln Gln Ser Ser Ser Thr Ile Thr Ser Ser Pro
340 345 350

Ser Ser Thr Thr Leu Ser Thr Ser Ile Pro Thr Thr Thr Thr Pro Glu
355 360 365

Ile Thr Ser Thr Leu Ser Ser Leu Pro Asp Asn Ala Ile Cys Ser Tyr
370 375 380 

Leu Asp Glu Thr Thr Thr Ser Thr Thr Phe Thr Thr Thr Met Leu Thr 
385 390 395 400 

Ser Thr Thr Thr Glu Glu Pro Ser Thr Ser Thr Thr Thr Thr Glu Val 
405 410 415 

Thr Ser Thr Ser Ser Thr Val Thr Thr Thr Glu Pro Thr Thr Thr Leu 
420 425 430 

Thr Thr Ser Thr Ala Ser Thr Ser Thr Thr Glu Pro Ser Thr Ser Thr 
435 440 445 

Val Thr Thr Ser Pro Ser Thr Ser Pro Val Thr Ser Thr Val Thr Ser 
450 455 460 

Ser Ser Ser Ser Ser Thr Thr Val Thr Thr Pro Thr Ser Thr Glu Ser 
465 470 475 480 

Thr Ser Thr Ser Pro Ser Ser Thr Val Thr Thr Ser Thr Thr Ala Pro 
485 490 495 

Ser Thr Ser Thr Thr Gly Pro Ser Ser Ser Ser Ser Thr Pro Ser Ser 
500 505 510 

Thr Ala Ser Ser Ser Val Ser Ser Thr Ala Ser Ser Thr Gln Ser Ser
515 520 525

Thr Ser Thr Gln Gln Ser Ser Thr Thr Thr Lys Ser Glu Thr Thr Thr
530 535 540

Ser Ser Asp Gly Thr Asn Pro Asp Phe Tyr Phe Val Glu Lys Ala Thr
545 550 555 560

Thr Thr Phe Tyr Asp Ser Thr Ser Val Asn Leu Thr Leu Asn Ser Gly
565 570 575

Leu Gly Ile Ile Gly Tyr Gln Thr Ser Ile Glu Cys Thr Ser Pro Thr
580 585 590 

Ser Ser Asn Tyr Val Ser Thr Thr Lys Asp Gly Ala Cys Phe Thr Lys 
595 600 605 

Ser Val Ser Met Pro Arg Leu Gly Gly Thr Tyr Pro Ala Ser Thr Phe 
610 615 620

Val Gly Pro Gly Asn Tyr Thr Phe Arg Ala Thr Met Thr Thr Asp Asp 
625 630 635 640 

Lys Lys Val Tyr Tyr Thr Tyr Ala Asn Val Tyr Ile Gln Glu Tyr Ser
645 650 655

Ser Thr Thr Ile Glu Ser Glu Ser Ser Thr Ser Ala Val Ala Ser Ser
660 665 670 

Thr Ser Ser Thr Pro Ser Thr Pro Ser Ser Thr Leu Ser Thr Ser Thr 
675 680 685 

Val Thr Glu Pro Ser Ser Thr Arg Ser Ser Asp Ser Thr Thr Thr Ser 
690 695 700 

Ala Gly Ser Thr Thr Thr Leu Gln Glu Ser Thr Thr Thr Ser Glu Glu
705 710 715 720

Ser Thr Thr Asp Ser Ser Thr Thr Thr Ile Ser Asp Thr Ser Thr Ser
725 730 735 

Thr Ser Ser Pro Ser Ser Thr Thr Ala Asp Ser Thr Ser Thr Leu Ser 
740 745 750 

Val Asp Gln Phe Asp Phe Ile Leu Asp Ser Gly Leu Ser Trp Asn Glu
755 760 765

Thr Arg His Asn Glu Asp Ser Ile Asn Ile Val Pro Leu Pro Thr Asn
770 775 780

Ala Ile Thr Pro Thr Glu Arg Ser Gln Thr Phe Glu Cys Arg Asn Val
785 790 795 800

Ser Thr Glu Pro Phe Leu Ile Ile Lys Glu Ser Thr Cys Leu Asn Tyr
805 810 815

Ser Asn Thr Val Leu Asn Ala Thr Tyr Ser Ser Asn Ile Pro Ile Gln
820 825 830

Pro Ile Glu Thr Phe Leu Val Gly Ile Gly Thr Tyr Glu Phe Arg Ile
835 840 845

Asn Met Thr Asp Leu Thr Thr Met Gln Val Val Ser His Ile Phe Thr
850 855 860 

Leu Asn Val Val Ala Asp Ser Thr Ser Thr Ser Glu Val Thr Ser Thr 
865 870 875 880 

Thr Ser Thr Gly Ser Ser Ser Glu Ser Ser Ala Ile Ser Thr Thr Ser
885 890 895

Gly Ile Glu Ser Thr Ser Thr Leu Glu Ala Ser Thr Thr Asp Ala Ser
900 905 910

Gln Asp Ser Ser Thr Ser Thr Ser Asp Ser Gly Thr Thr Ser Asp Ser
915 920 925

Thr Thr Ile Asp Ser Ser Asn Ser Thr Pro Ser Thr Ser Asp Ser Ser
930 935 940

Gly Leu Ser Gln Thr Pro Ser Asp Ser Ser Ser Ala Ser Asp Ser Met
945 950 955 960 

Arg Thr Thr Thr Val Asp Pro Asp Ala Ser Thr Glu Thr Pro Tyr Asp 
965 970 975 

Phe Val Leu Glu Asn Leu Thr Trp Asn Glu Thr Val Tyr Tyr Ser Glu 
980 985 990 

Asn Pro Phe Tyr Ile Thr Pro Ile Pro Asn Lys Glu Pro Gly Ala Leu
995 1000 1005

Thr Thr Ala Met Thr Cys Gln Cys Arg Asn Asp Ser Ser Gln Pro Phe
1010 1015 1020 

Val Leu Leu Lys Glu Ser Asn Cys Leu Thr Glu Phe Gly Lys Asn Gly 
1025 1030 1035 1040 

Ala Tyr Ser Ala Ser Val Ser Phe Asn Pro Met Thr Ser Phe Val Pro 
1045 1050 1055 

Ala Thr Gly Thr Tyr Glu Phe Leu Ile Asn Val Thr Asn Arg Ala Ser
1060 1065 1070

Gly Glu Ser Ala Ser His Ile Phe Thr Met Asn Val Val Leu Pro Thr
1075 1080 1085 

Thr Thr Thr Glu Thr Pro Pro Thr Thr Val Ser Ser Ser Asp Asp Ala 
1090 1095 1100 

Gly Gly Lys Thr Gly Gly Thr Gly Ala Thr Gly Gly Thr Gly Gly Thr 
1105 1110 1115 1120 

Gly Ser Gly Gly Ser Ala Thr Thr Leu Ser Thr Gly Asp Ala Val Arg 
1125 1130 1135 

Ser Thr Thr Ser Gly Ser Gly Ser Gly Gln Ser Ser Thr Gly Ser Gly
1140 1145 1150 

Ala Gly Gly Ser Gly Thr Thr Ala Ser Gly Ser Gly Ser Gly Gly Ser 
1155 1160 1165 

Ser Gly Thr Gly Ser Asp Gly Val Asn Ser Gly Lys Thr Thr Ala Leu 
1170 1175 1180 

Asn Gly Asp Gly Thr Gly Ser Gly Thr Ala Thr Thr Pro Gly Ser His 
1185 1190 1195 1200 

Leu Gly Asp Gly Gly Ser Thr Ser Gly Ser Gly Ser Asp Ser Asn Gly 
1205 1210 1215 

Ser Ser Gly Val Ser Thr Lys Ser Ser Ser Gly Ser Asp Thr Ser Gly 
1220 1225 1230 

Ser Ser Asp Ser Ser Gly Ala Asn Gly Ala Phe Ser Ala Thr Ala Gln
1235 1240 1245 

Pro Ser Thr Arg Thr Thr Lys Thr Arg Ser Ser Leu Ala Thr Val Ser 
1250 1255 1260 
Pro Ile Ser Ala Ala Glu Gln Ala Ile Ile Asp Ala Gln Lys Ala Asp
1265 1270 1275 1280

Val Met Asn Gln Leu Ala Gly Ile Met Asp Gly Ser Ala Ser Asn Asn
1285 1290 1295

Ser Leu Asn Thr Ser Ser Ser Leu Leu Asn Gln Ile Ser Ser Leu Pro
1300 1305 1310

Ala Ala Asp Leu Val Glu Val Ala Gln Ser Leu Leu Ser Asn Thr Leu
1315 1320 1325

Lys Ile Pro Gly Val Gly Asn Met Ser Ser Val Asp Val Leu Lys Thr
1330 1335 1340

Leu Gln Asp Asn Ile Ala Thr Thr Asn Ser Glu Leu Ala Asp Glu Met
1345 1350 1355 1360

Ala Lys Val Ile Thr Lys Leu Ala Asn Val Asn Met Thr Ser Ala Gln
1365 1370 1375 

Ser Leu Asn Ser Val Leu Ser Ser Leu Asp Leu Ala Leu Lys Gly Ser 
1380 1385 1390 

Thr Val Tyr Thr Leu Gly Val Ser Ser Thr Lys Ser Lys Asp Gly Thr 
1395 1400 1405 

Tyr Ala Val Ile Phe Gly Tyr Val Ile Ala Ser Gly Tyr Thr Leu Val
1410 1415 1420

Ser Pro Arg Cys Thr Leu Ser Ile Tyr Gly Ser Thr Ile Tyr Leu Thr
1425 1430 1435 1440

Gly Asp Thr Arg Ala Ser Tyr Lys Gln Leu Asp Gly Asp Thr Val Thr
1445 1450 1455

Ala Asp Thr Met Leu Ala Ala Ala Ile Gly Ile Gln Gly Met Phe Ala
1460 1465 1470

Thr Asn Gly Arg Thr Val Gln Val Glu Gln Asp Lys Ile Asp Asp Lys
1475 1480 1485

Arg Ser Leu Val Ser Gly Asn Ile Met Ala Thr Met Ser Gly Val Gly
1490 1495 1500

Asp Val Gln Ser Gly Glu Tyr Ser Tyr Asn Asp Met Tyr Val Thr Ala
1505 1510 1515 1520

Trp Asn Val Thr Tyr Asp Asn Ser Thr Val Gly Ser Thr Ser Gln Lys
1525 1530 1535

Asn Thr Ser Phe Ser Phe Asn Ile Pro Val Ser Glu Val Gln Tyr Ile
1540 1545 1550

Leu Leu Ile Glu Ser Gly Thr Met Ile Lys Leu His Ser Thr Gln Asn
1555 1560 1565

Ile Val Ser Arg Gly Leu Val Val Thr Ala Ser Tyr Gly Gly Val Thr
1570 1575 1580

Tyr Thr Ile Thr Cys Thr Asn Gly Thr Gly Lys Phe Val Glu Val Asp
1585 1590 1595 1600

Thr Asp Asn Ala Ile Phe Ser Tyr Asn Ala Asp Ser Phe Thr Val Val
1605 1610 1615

Ala Ser Asp Gly Ser Ser Ala Ser Thr Val Lys Lys Leu Ile Gln Met
1620 1625 1630

Pro Ile Val Ile Glu Asn Val Asn Leu Ala Leu Phe Asn Gln Thr Thr
1635 1640 1645 

Ser Pro Leu Val Phe Ser Asn Ala Gly Ser Tyr Ser Met Arg Met Val 
1650 1655 1660 

Leu Ser Pro Gln Asp Ile Gly Ile Pro Ala Val Ser Ala Leu Ser Gln
1665 1670 1675 1680

Thr Val Ser Ile Ser Thr Leu Ser Pro Thr Ala Ser Tyr Thr Lys Asp
1685 1690 1695

Asp Leu Gln Ser Leu Ile Lys Glu Gln Thr Leu Val Thr Val Ser Gly
1700 1705 1710

Thr Thr Ser Asn Ser Leu Leu Ser Ile Ala Gly Ser Leu Thr Ser Ala
1715 1720 1725

Leu Lys Ile Ala Leu Asp Asn Pro Leu Ser Ser Asp Leu Ala Ala Asn
1730 1735 1740 

Leu Lys Tyr Ala Thr Asp Asn Tyr Asp Ser Leu Tyr Asn Val Leu Pro 
1745 1750 1755 1760 

Ser Asp Pro Asp Asn Ile Val Tyr Val Glu Glu Met Thr Ser Glu Glu
1765 1770 1775

Trp Ala Ala Tyr Val Thr Lys Met Phe Gln Lys Asn Ile Ala Lys Asn
1780 1785 1790

Leu Ala Asn Gln Leu Ala Ser Thr Leu Asp Thr Leu Glu Asn Thr Leu
1795 1800 1805 

Ala Ala Arg Ala Ile Ala Thr Gly Asn Leu Pro Tyr Asp Tyr Ser Asn
1810 1815 1820

Ser Val Asp Gly Thr Gly Met Val Ile Val Ile Asp Asp Ala Ser Asn
1825 1830 1835 1840

Ile Val Gly Lys Thr Gln Asn Cys Glu Glu Trp Ala Phe Lys Leu Pro
1845 1850 1855

Ser Pro Ala Ser Thr Leu Asn Thr Ala Glu Ile Thr Asp Lys Thr Leu
1860 1865 1870

Ile Gln Val Gly Leu Val Cys Tyr Ala Thr Asn Pro Arg Thr Tyr Val
1875 1880 1885

Asp Asn Phe Asp Met Leu Ile Thr Ser Gly Ala Leu Glu Ala His Ile
1890 1895 1900

Lys Asp Glu Asn Gln Ile Ile Ile Pro Ile Thr Gly Thr Thr Ala Pro
1905 1910 1915 1920

Ile Tyr Val Asn Gly Arg Gly Ser Glu Asp Asp Ala Val Leu Thr Leu
1925 1930 1935

Met Gln Gln Gly Asp Phe Ala Ser Tyr Gln Ile Leu Asp Leu His Ala
1940 1945 1950

Phe Arg Thr Thr Asn Trp Asn Asn Ser Leu Gln Val Glu Ile Ile Ala
1955 1960 1965

Ser Gln Asp Tyr Glu Ile Pro Asn Asn Asp Asp Thr Tyr Met Phe Ser
1970 1975 1980

Ser Phe Gln Ser Leu Pro Gly Pro Leu Glu Ser Asn His Glu Trp Ile
1985 1990 1995 2000

Phe Asp Leu Asn Thr Leu Asn Lys Thr Ser Asn Tyr Phe Val Thr Ala 
2005 2010 2015 

Gly Asn Leu Ile Asn Asn Thr Gly Leu Phe Phe Ile Gly Ile Gly Lys
2020 2025 2030

Arg Asn Ser Ser Thr Asn Thr Gly Asn Ser Ser Asp Ile Val Asn Tyr
2035 2040 2045

Gly Gln Tyr Asp Ser Met Gln Trp Ser Phe Ala Arg Ser Val Pro Met
2050 2055 2060

Asp Tyr Gln Val Ala Ala Val Ser Lys Gly Cys Tyr Phe Tyr Gln Lys
2065 2070 2075 2080

Thr Ser Asp Val Phe Asn Ser Glu Gly Met Tyr Pro Ser Asp Gly Gln
2085 2090 2095

Gly Met Gln Phe Val Asn Cys Ser Thr Asp His Leu Thr Met Phe Ser
2100 2105 2110

Val Gly Ala Phe Asn Pro Thr Ile Asp Ala Asp Phe Ser Tyr Asn Tyr
2115 2120 2125

Asn Val Asn Glu Ile Glu Lys Asn Val Lys Val Met Ile Ala Ala Val
2130 2135 2140

Phe Met Leu Val Val Tyr Gly Cys Leu Thr Ile Asn Ala Ile Ile Cys
2145 2150 2155 2160

Gln Arg Lys Asp Ala Ser Arg Gly Arg Leu Arg Phe Leu Lys Asp Asn
2165 2170 2175

Glu Pro His Asp Gly Tyr Met Tyr Val Ile Ala Val Glu Thr Gly Tyr
2180 2185 2190

Arg Met Phe Ala Thr Thr Asp Ser Thr Ile Cys Phe Asn Leu Ser Gly
2195 2200 2205

Asn Glu Gly Asp Gln Ile Phe Arg Ser Phe Arg Ser Glu Glu Asp Gly
2210 2215 2220 

Asn Trp Glu Phe Pro Phe Ser Trp Gly Thr Thr Asp Arg Phe Val Met 
2225 2230 2235 2240 

Thr Thr Ala Phe Pro Leu Gly Glu Leu Glu Tyr Met Arg Leu Trp Leu 
2245 2250 2255 

Asp Asp Ala Gly Leu Asp His Arg Glu Ser Trp Tyr Cys Asn Arg Ile
2260 2265 2270

Ile Val Lys Asp Leu Gln Thr Gln Asp Ile Tyr Tyr Phe Pro Phe Asn
2275 2280 2285 

Asn Trp Leu Gly Thr Lys Asn Gly Asp Gly Glu Thr Glu Arg Leu Ala 
2290 2295 2300 

Arg Val Glu Tyr Lys Arg Arg Phe Leu Asp Glu Ser Met Ser Met His 
2305 2310 2315 2320 

Met Leu Ala Gln Thr Ile Ser Trp Phe Ala Met Phe Thr Gly Gly Gly
2325 2330 2335

Asn Arg Leu Arg Asp Arg Val Ser Arg Gln Asp Tyr Ser Val Ser Ile
2340 2345 2350

Ile Phe Ser Leu Val Val Val Ser Met Ile Ser Ile Thr Ile Leu Lys
2355 2360 2365

Ser Asp Asn Ser Ile Ile Ser Asp Ser Lys Ser Val Ser Glu Phe Thr
2370 2375 2380

Phe Thr Ile Lys Asp Ile Ala Phe Gly Val Gly Phe Gly Val Leu Ile
2385 2390 2395 2400

Thr Phe Leu Asn Ser Leu His Ile Leu Leu Cys Thr Lys Cys Arg Ser
2405 2410 2415 

His Ser Glu His Tyr Tyr Tyr Lys Lys Arg Lys Arg Glu Asp Pro Glu 
2420 2425 2430 

Phe Lys Asp Asn Ser Gly Ser Trp Pro Met Phe Met Ala Gly Met Ala 
2435 2440 2445 

Arg Thr Ile Ile Val Phe Pro Val Leu Met Gly Leu Ile Tyr Ile Ser
2450 2455 2460

Gly Ala Gly Met Ser Leu Met Asp Asp Leu Ala Asn Ser Phe Tyr Ile
2465 2470 2475 2480

Arg Phe Leu Ile Ser Leu Ile Leu Trp Ala Val Val Phe Glu Pro Ile
2485 2490 2495 



Lys Ile Ile Asn Lys Leu Glu Gly Ser Asp Gly Thr Val Val Lys Tyr
2515 2520 2525

Tyr Glu Met Leu Tyr Ile Phe Phe Ser Val Leu Ile Phe Val Lys Glu
2530 2535 2540

Ile Val Phe Tyr Leu Tyr Gly Arg Tyr Lys Val Ile Thr Thr Met Lys
2545 2550 2555 2560

Pro Thr Arg Asn Pro Phe Lys Ile Val Tyr Gln Leu Ala Leu Gly Asn
2565 2570 2575

Phe Ser Pro Trp Asn Phe Met Asp Leu Ile Val Gly Ala Leu Ala Val
2580 2585 2590

Ala Ser Val Leu Ala Tyr Thr Ile Arg Gln Arg Thr Thr Asn Arg Ala
2595 2600 2605

Met Glu Asp Phe Asn Ala Asn Asn Gly Asn Ser Tyr Ile Asn Leu Thr
2610 2615 2620

Glu Gln Arg Asn Trp Glu Ile Val Phe Ser Tyr Cys Leu Ala Gly Ala
2625 2630 2635 2640

Val Phe Phe Thr Ser Cys Lys Met Ile Arg Ile Leu Arg Phe Asn Arg
2645 2650 2655

Arg Ile Gly Val Leu Ala Ala Thr Leu Asp Asn Ala Leu Gly Ala Ile
2660 2665 2670

Val Ser Phe Gly Ile Ala Phe Leu Phe Phe Ser Met Thr Phe Asn Ser
2675 2680 2685

Val Leu Tyr Ala Val Leu Gly Asn Lys Met Gly Gly Tyr Arg Ser Leu
2690 2695 2700 

Met Ala Thr Phe Gln Thr Ala Leu Ala Gly Met Leu Gly Lys Leu Asp
2705 2710 2715 2720

Val Thr Ser Ile Gln Pro Ile Ser Gln Phe Ala Phe Val Val Ile Met
2725 2730 2735

Leu Tyr Met Ile Ala Gly Ser Lys Leu Val Leu Gln Leu Tyr Val Thr
2740 2745 2750

Ile Ile Met Phe Glu Phe Glu Glu Ile Arg Asn Asp Ser Glu Lys Gln
2755 2760 2765

Thr Asn Asp Tyr Glu Ile Ile Asp His Ile Lys Tyr Lys Thr Lys Arg
2770 2775 2780

Arg Leu Gly Leu Leu Glu Pro Lys Asp Phe Ala Pro Val Ser Ile Ala
2785 2790 2795 2800

Asp Thr Gln Lys Asp Phe Arg Leu Phe His Ser Ala Val Ala Lys Val
2805 2810 2815

Asn Leu Leu His His Arg Ala Thr Arg Met Leu Gln Thr Gln Gly Gln
2820 2825 2830

Tyr Gln Asn Gln Thr Val Ile Asn Tyr Thr Leu Ser Tyr Asp Pro Val
2835 2840 2845

Ser Ala Ile His Glu Thr Gly Pro Lys Arg Phe Gln Lys Trp Arg Leu
2850 2855 2860 

Asn Asp Val Glu Lys Asp 
2865 2870 

<210> 16 
<211> 200 
<212> PRT 

<213> C. elegans Pkd-2 deletion mutant (sy606) protein
<400> 16 

Met Glu Gly Arg Gly Glu Gly Glu Asp Leu Pro Pro Thr Ser Tyr Phe 
1 5 10 15

Pro Phe Glu Glu Gly His Thr Leu Trp Met Lys Arg Glu Lys Ile Lys
20 25 30

His Leu Gln Arg Ile Leu Gln Phe His Ser Asp Glu Ser Ile Leu Met
35 40 45

Ile Asp Lys Lys Leu Met Ile Ser Gly Gly Leu Glu Pro Pro Thr Phe
50 55 60 

Cys Val Leu Asp Arg Cys Asp Asn His Tyr Thr Thr Lys Pro Arg His 
65 70 75 80 

Leu Pro Pro Phe Glu Val Phe Leu Phe Val Val Ile Phe Lys Cys Glu
85 90 95 

Pro Ser Ser Met Asn Tyr Gly Ala Ala Asp Glu Arg Trp Ala Asn Pro 
100 105 110 
Pro Gln Pro Val Ala Ala Ala Glu His Gly Pro Ser Phe Asp His Ser
115 120 125

Met Val Ser Glu Glu Tyr Glu His Asp Lys Lys Lys Asn Pro Ala Gln
130 135 140

Lys Glu Gly Ile Ser Phe Ser Gln Ala Leu Leu Ala Ser Gly His Glu
145 150 155 160

Lys Ser Asp Gly Lys Ile Lys Leu Thr Ala Arg Ser Phe Met Glu Val
165 170 175

Gly Gly Tyr Ala Val Phe Leu Ile Val Leu Val Tyr Asp Ser Ser Thr
180 185 190

Pro Arg Gln Lys Ser Leu Lys Thr
195 200